2012年6月1日星期五

那啥简单的Sexp解析

* 简单的Sexp解析

嘛,没啥东西可贴,就找找以前有没有什么好玩的东西出来吧 >_<
这个就是简单地把那种lisp括号表达式在Python中解析成字符串组成的元组。
能解析的格式被简化的很厉害,不仅是为了实现简单,也是确实让代码可以更简单一些。
用的LL的递归下降对字符解析,考虑到Python3有流IO,用那个会更好一些。
#! /usr/bin/env python26
# coding: utf-8

class SSexpr:
    """
    A Simple Simplified Sexpr Reader.
    @Author ee.zsy
    @Date Mar.2012
    type ssexpr = Symbol of string | Block of ssexpr list;
    function parse : string - > ssexpr;
    >>> parse(r'(1 2 "3\" 4"( 1 2 3))')
    ['1', '2', '3" 4', ['1', '2', '3']]
    Notice that Python evaluates expressions from left to right.
    """
    w = " \n\t"
    e = " \n\t()[]"
    b = (lambda i:lambda s,x:i(x).pop())(set(')]').difference)
    m = 'Brackets or Parentheses do not match.'
    a = dict(zip('\"\'nt\\','\"\'\n\t\\'))
    f = 'Escape character is not recognized.'
    def __init__(s,code):
        s.s, s.p = code, 0
    def read(s):
        try:return s.exp()
        except RuntimeError as e:raise SyntaxError(e)
    @classmethod
    def parse(c,code):
        return c(code).read()
    def peek(s):
        return s.s[s.p] if s.p<len(s.s) else None
    def eat(s):
        s.p += 1;return s.s[s.p-1]
    def exp(s,cm=0):
        c = s.peek()
        if c is None:return None
        elif c in s.w:return s.eat() and s.exp()
        elif c == '(':return s.eat() and s.blk(')')
        elif c == '[':return s.eat() and s.blk(']')
        elif c == '"':return s.eat() and s.str()
        else:return s.sym()
    def sym(s):
        c = s.peek()
        if c is None or c in s.e:return ""
        else:return s.eat() and c+s.sym()
    def str(s):
        c = s.peek()
        if c is None:raise EOFError
        elif c == '"':return s.eat() and ''
        elif c =='\\':return s.eat() and s.esc(s.eat())+s.str()
        else:return s.eat() and c+s.str()
    def esc(s,c):
        if c in s.a:return s.a[c]
        else:raise SyntaxError(s.f)
    def blk(s,end):
        c = s.peek()
        if c is None:raise EOFError
        elif c == s.b(end):raise SyntaxError(s.m)
        elif c in s.w:return s.eat() and s.blk(end)
        elif c == end:return s.eat() and tuple()
        else:return tuple([s.exp()])+s.blk(end)
        
if 1:
    assert SSexpr("""123 12""").read() == '123'
    assert SSexpr("""(1 2 3 4(1 2 3))""").read() == ('1', '2', '3', '4', ('1', '2', '3'))
    assert SSexpr("""(1 2 "3 4"( 1 2 3))""").read() == ('1', '2', '3 4', ('1', '2', '3'))
    assert SSexpr.parse(r"""(1 2 "3\" 4"( 1 2 3))""") == ('1', '2', '3" 4', ('1', '2', '3'))
    assert SSexpr.parse(r"""[]""") == tuple()

if __name__ == '__main__':
    print SSexpr("""[hello world]""").read()
    while 1:
        print SSexpr(raw_input()).read()



再往前的一个功能略微多一点点的版本,有单独的正则表达式实现的词法解析部分。

#@Date Nov.27,2011
#@Author ee.zsy
import re
pattern = r'''
 (?P<lpr>\()
|(?P<rpr>\))
|(?P<qte>\')
|(?P<qqt>`)
|(?P<uqs>,@)
|(?P<uqt>,)
|(?P<blt>\#[tT])
|(?P<blf>\#[fF])
|(?P<chr>\#\\.[^ \)\t\n\r]*)
|(?P<shr>\#)
|(?P<num>[-+]?(\d+(\.\d*)?|\.\d+)([eE][-+]?\d+)?)
|(?P<str>\"(\\.|[^"])*\")
|(?P<wht>([ \t\n\r]+)|(;.*?[\n\r]))
|(?P<sym>[^ ;\)\(\t\n\r]+)
'''
token = re.compile(pattern,re.S|re.X)
def peekToken(text,start=0):
    regex = token.match(text,start)
    if regex:
        tag = [(k,v) for k,v in regex.groupdict().items() if v]
        return tag[0],regex.end()
    return None,None


然后余下的语法解析还是一样的了,只是情况相比要多一些。
不过代码代码不像这里这样简单的看的那么清爽,这里就不贴了。
虽然这里的东西都故意做了简化,但是改动之后倒是可以做表达式计算或者类AWK的解析什么的。
手写LL(1)的时候,有些左连接的操作符,要把循环的部分单独写一条规则出来。
比如像“num "+" num {"+" num}”这样的,要保证可以只根据下一个未处理的Token作出分支。
这里的这种情况有个专门的名词叫做什么什么来着的,囧死了人家都不知道了。

话说内部接口封装不够不可靠都很讨厌。


* Stack machine

这里先假设操作符都是二元的,那么最简单的一个前缀转后缀可以这么实现。
(define e '(+ (* 3 (- 2 3)) (+ 2 3)))
(define (t e) (if (pair? e) (let ((x (map t e))) (apply append (append (cdr x) (list (car x))))) (list e)))
(display (t e))

那么接下来,那么接下来就以后再写吧。

2012年5月29日星期二

关于AWK的絮絮叨叨

* 没啥特别重要的知道的可说的东西。

* HelloWorld首先:awk 'BEGIN{print "helloworld"}'

* 写在文件里,开头用 #!/bin/awk -f

* 变量不声明即有默认值,根据类型是0或空字符串。变量是全局的,局部通过函数的参数的形式来实现。

* awk可用来处理文本数据,和排版输出文本。

* 默认以一行为一条记录,$1 $2 ...依次为每行的字段(共NF个)。分割行和字段的方式可以自定义,文本的处理过程中是以行为单位触发处理事件的。

* 记录用正则FS分割,当前行号是NR

* 变量x用“x+0”转为数字,用“x ""”转换为字符串,比较运算符当有一侧为字符串时,都自动转换为字符串。

* 识别的方式可以在识别过程中修改。

* 有一堆系统变量可以读取和设置。

* awk文件由多个“pattern { action }”构成,当模式的条件匹配后,会执行对应的命令。

* 在规则中next和exit仅作用于当前行的匹配。

* 模式可以是空或BEGIN/END或正则表达式或条件语句

* 关联数组是字符串到变量值的映射,in操作符可用于if判断和for循环,具体使用方法照着示例代码用即可。

* 多维数组,即C语言中数组的数组,是用SUBSEP拼接下标。

* man 可参阅 http://www.manpagez.com/man/1/awk/ 。

* man里有教程、用法、示例和参考列表。

* 自带里数学/字符串/数组/IO/系统方面的一些函数。

* 字符串连接是空格,逗号是分隔符。

* 自定义的函数不可以复制给变量。

* 基本语法和C和bc都差不多。

* 正则表达式扮演了很重要的角色,比perl的正则功能上简单。

* 交互接受用户输入可以用“getline input "-"”,然后使用input变量即可(留空则$0)。

* 输出有print语句和printf函数,可输出到屏幕或文件。

* 可以干bc/sed/shell可以干的活,可以放在shell里面干活,可配合管道(getline,print,system)使用。

* 执行命令或输出到文件,相同字符串的所代表可以一直使用,直到close它。例如“while(getline<"file">0)”是从文件不断读取行。

* 输出到错误输出使用“print message I "cat 1>&2"”。

* 命令行的参数ARGC、ARGV需在BEGIN中处理,之后将被当作是读入的文件名。

* the awk language里有实现若干排序算法和对awk文法进行解析的例子。

* 若干常见实例参考::《SED 单行脚本快速参考》的 awk 实现 http://linuxtoy.org/archives/sed-awk.html 。

* mawk在windows下可用。

* ideone.com里也有awk可玩。

* 写python然后“python -c 'for i in file("a.txt"):print i'”作用和“awk '{print $0}' a.txt”差不多。

* 不论手段,在更高效的前提下,把文本处理完就行。

* 有时候不用来处理文本也行。

* 不用awk也行。

* 很多情况用文本编辑器也行。

* 若干示例
+ 显示文件:awk '{print}'
+ 加行号:awk '{print NR,$0}'
+ 去行号:awk '{sub(/^[0-9]+ /,"");print}'
+ 统计行数:awk 'END{print NR}'
+ 排序:awk '{A[NR]=$0}END{asort(x);for(i=1;i<=NR;i++)print A[ i]}'
+ 计算均值:awk '{S+=$0}END{print S/NR}'
+ 另一种均值写法:awk 'BEGIN{for(i=1;i

2012年5月23日星期三

代码小练习三 - 图

继续代码小练习系列,理念是写出简单有趣的代码,虽然践行得有点失败的说。



* 图的广度优先搜索


代码很粗暴,可以用来走走迷宫求求无权图的最短路径。
此外把Queue换成Stack就是深度优先搜索了,递归也行。


type graph = int list*(int*int) list
let g = ([1;2;3;4],[(1,2);(2,3);(3,4);(3,1);(2,4)])
let bfs (g:graph) s =
    let q = Queue.create() in
    let module S = Set.Make(struct type t = int let compare = compare end) in
    let set = ref (S.add s S.empty) in
    let rec loop c = begin
        print_int c;
        List.iter
            (fun (x,y)->if x==c && not (S.mem y !set) then begin
                (set := (S.add y (!set))); Queue.push y q end)
            (match g with _,edges->edges);
        if Queue.is_empty q then () else loop (Queue.pop q);
    end in loop s;;
bfs g 1;;






* 有权无向连通图的最小生成树


MinimumSpanningTrees就是用那啥Kruskal和Prim,往生成树中贪婪地添加最优边。
关于所添加的最优的边的选择,前者是图中权最小的边,后者是生成树相邻的边。


type weight=int
 and vertex=int
 and edge=vertex*vertex*weight
 and graph=vertex list*edge list
let g = ([1;2;3;4],[(1,2,2);(2,3,3);(3,4,2);(1,3,4);(2,4,2)])
let mst_kruskal g=
    let s = Hashtbl.create 0 in
    let rec find_set (x:vertex) =
        if x=(try (Hashtbl.find s x) with Not_found->x)
        then x else find_set (Hashtbl.find s x) in
    let union x y = Hashtbl.replace s (find_set x) (find_set y) in
    let module P = Set.Make(
        struct type t = vertex*vertex let compare = compare end) in
    let a = ref P.empty in
    let edges = List.sort (fun (_,_,w1) (_,_,w2)->w1-w2) (snd g) in
    List.iter (fun (x,y,z)->if find_set(x)<>find_set(y) then
        begin union x y; a := (P.add (x,y) (!a)) end) edges;
    P.iter (fun (x,y)->begin Printf.printf "\\/(%d,%d)" x y end) !a;
;;
mst_kruskal g;;



g = ([1,2,3,4],[(1,2,2),(2,3,3),(3,4,2),(1,3,4),(2,4,2)])
def mst_prim(g,r):
    k,p,q,w={r:0},{},g[0][:],dict(sum([[[(x,y),z],[(y,x),z]] for x,y,z in g[1]],[]))
    while q: u = min(q,key=lambda x:k.get(x,float('+inf'))); q.remove(u)
        for v in set([y for x,y,_ in g[1] if x==u]+[x for x,y,_ in g[1] if y==u]):
            if v in q and w[(u,v)]<k.get(v,float('+inf')):
                k[v], p[v] = w[(u,v)], u
    return p
print mst_prim(g,1)





* 带权有向图的单源最短路径


Single-Source Shortest Paths可从起点按照距离动态规划。
Dijkstra可以得到从单一起点到其他所有顶点的距离及路径。
Bellman-Ford当负权边存在时路径若经过负圈则路径不存在。


g = ([1,2,3,4],[(1,2,2),(2,3,3),(3,4,2),(1,3,4),(2,4,2)])
def dijkstra(g,s,t):
    q, w = g[0][:], {(x,y):z for x,y,z in g[1]}
    d, p = {i:0 if i==s else float('+inf') for i in g[0]}, {}
    while q:
        u = min(q,key=lambda x:d[x]); q.remove(u)
        if u==t:
            f = lambda i:[i]+([] if i==s else f(p[i]))
            return list(reversed(f(t)))
        for v in {y for x,y,_ in g[1] if x==u}:
            if d[v] > d[u] + w[(u,v)]:
                d[v], p[v] = d[u]+w[(u,v)], u
    return None
print dijkstra(g,1,4)



g2 = ([1,2,3,4],[(1,2,2),(2,3,3),(3,4,2),(3,1,-9),(2,4,2)])
g3 = ([1,2,3,4],[(1,2,2),(2,3,3),(3,4,2),(3,1,-4),(2,4,2)])
def bellman_ford(g,s,t):
    d, p = {i:0 if i==s else float('+inf') for i in g[0]}, {}
    for i in range(len(g[0])-1):
        for u,v,w in g[1]:
            if d[v] > d[u] + w:
                d[v], p[v] = d[u]+w, u
    if any(d[v] > d[u] + w for u,v,w in g[1]):
        return None
    f = lambda i:[i]+([] if i==s else f(p[i]))
    return list(reversed(f(t)))
print bellman_ford(g,1,4)
print bellman_ford(g2,1,4)
print bellman_ford(g3,1,4)


好吧,咱承认是在对着书敲的代码,脑子已经不大清楚了囧。
这里的Dijkstra改改就是Prim,虽然作用不同代码结构一样。





* 任意两点间的最短路径


All-Pairs Shortest Paths,用Floyd-Warshall算法动态规划 ,一次得到图中任意两点间距离。

g = ([1,2,3,4],[(1,2,2),(2,3,3),(3,4,2),(3,1,4),(2,4,2)])
def floyd_warshall(g):
    n = len(g[0])
    d = [[float('+inf')]*n for i in g[0]]
    t = {v:k for k,v in enumerate(g[0])}
    for u,v,w in g[1]:
        d[t[u]][t[v]] = min(d[t[u]][t[v]],w)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                d[i][j] = min(d[i][j],d[i][k]+d[k][j])
    return [(u,v,d[t[u]][t[v]]) for u in g[0] for v in g[0]]
print "\n".join("%s->%s : %s"%(u,v,w) for u,v,w in floyd_warshall(g))





* 网络流的最大流


Maximum Flow,可以通过不断添加流量到增广路上(Augmenting Path)计算。
其中增广路通过广度优先搜索在剩余网络(Residual Network)上查找得到。


g = ([1,2,3,4],[(1,2,2),(2,3,3),(3,4,2),(3,1,4),(2,4,2)])
g2 = ([1,2,3,4],[(1,2,3),(2,3,3),(3,4,2),(3,1,4),(2,4,2)])
g3 = ([1,2,3,4],[(1,2,5),(2,3,3),(3,4,2),(3,1,4),(2,4,2)])
g4 = (['s','o','p','q','r','t'],[('s','o',3),('s','p',3),('o','p',2),('o','q',3),('p','r',2),('r','t',3),('q','r',4),('q','t',2)])
def edmonds_karp(g,s,t):
    w = dict(sum(([[(u,v),w],[(v,u),0]] for u,v,w in g[1]),[]))
    f = {i:0 for i in w.keys()}
    def bfs():
        q, u, p = [], s, []
        while u!=t:
            q.extend(p+[e] for e in w.keys() if e[0]==u and e not in p and f[e]<w[e])
            if not q: return None
            p = q.pop(0)
            u = p[-1][1]
        return p
    for p in iter(bfs,None):
        c = min(w[e]-f[e] for e in p)
        for u,v in p:
            f[(u,v)] += c
            f[(v,u)] -= c
    return sum(f[e] for e in w.keys() if e[0]==s)
print edmonds_karp(g,1,4)
print edmonds_karp(g2,1,4)
print edmonds_karp(g3,1,4)
print edmonds_karp(g4,'s','t')


这里没理解错什么吧,有些细节还未考虑过,感觉有不大对劲。。




* A*搜索与拓扑排序

2012年5月21日星期一

代码小练习二 - 优先队列

继续代码小练习,优先队列,还会有些其他的应用的例子吧。



* Binary Heap

实现上PriorityQueue可以用BinaryHeap结构,也可用于排序。这里的数组下标为方便是从0而没有从1开始算的。

def heapify(lst,i,size=None,less=lambda x,y:x<y):
    left, right = 2*i+1, 2*(i+1)
    size = size if size else len(lst)
    largest = i
    if left<size and less(lst[largest],lst[left]):#lst[left]>lst[largest]:
        largest = left
    if right<size and less(lst[largest],lst[right]):
        largest = right
    if largest!=i:
        lst[i],lst[largest] = lst[largest], lst[i]
        heapify(lst,largest,size=size)
def build_heap(lst):
    for i in range(int(len(lst)/2),-1,-1):
        heapify(lst,i)
def heap_sort(lst):
    build_heap(lst)
    for i in range(len(lst)-1,0,-1):
        lst[0], lst[i] = lst[i], lst[0]
        heapify(lst,0,size=i)
class PriorityQueue:
    def __init__(self,less=lambda x,y:x<y):
        self.less = less
        self.container = list()
    def push(self,x):
        lst = self.container
        parent = lambda x:int((i-1)/2)
        lst.append(x)
        i = len(lst)-1
        p = parent(i)
        while i>0 and self.less(lst[p],lst[i]):
            lst[i],lst[p] = lst[p], lst[i]
            i = p; p = parent(i)
    def pop(self):
        if not self.empty():
            lst = self.container
            x, lst[0] = lst[0], lst[-1]
            lst.pop()
            heapify(lst,0,less=self.less)
            return x
        else:
            raise "underflow"
    def top(self):
        if not self.empty():
            return self.container[0]
        else:
            raise "empty"
    def size(self):
        return len(self.container)
    def empty(self):
        return self.size()==0
        



* Huffman Coding

一个PriorityQueue的实际的例子,用于数据压缩。

#include <iostream>
#include <vector>
#include <queue>
#include <map>
#include <bitset>
#include <cstring>
using namespace std;
struct Node {
    char data;
    int freq;
    Node* left;
    Node* right;
    explicit Node(char data,int freq)
        :data(data),freq(freq),left(NULL),right(NULL) {
    }
    explicit Node(Node* left,Node* right)
        :data('\0'),freq(left->freq+right->freq),left(left),right(right) {
    }
    ~Node(){
        if(left)
            delete left;
        if(right)
            delete right;
    }
    bool isLeaf() const {
        return left==NULL or right==NULL;
    }
};
struct NodeCmp {
    bool operator()(const Node* n1,const Node* n2) {
        return n1->freq > n2->freq;
    }
};
Node* huffman_tree(const map<char,int>& entries) {
    priority_queue<Node*,vector<Node*>,NodeCmp> pq;
    for(map<char,int>::const_iterator it=entries.begin(); it!=entries.end(); ++it )
        pq.push(new Node(it->first,it->second));
    int n = entries.size();
    //while(pq.size()>1){
    for(int i=0; i<n-1; i++) {
        Node* x = pq.top();
        pq.pop();
        Node* y = pq.top();
        pq.pop();
        Node* n=new Node(x,y);
        pq.push(n);
    }
    return pq.top();
}
void huffman_count(string text,map<char,int>& result) {
    for(unsigned i=0; i<text.size(); i++)
        result[text[i]]++;
}
void huffman_travel(Node* node,vector<bool>& code,map<char, vector<bool> >& result) {
    if(node->isLeaf())
        result[node->data] = code;
    else {
        {
            vector<bool> left(code);
            left.push_back(false);
            huffman_travel(node->left,left,result);
        }{
            vector<bool> right(code);
            right.push_back(true);
            huffman_travel(node->right,right,result);
        }
    }
}

void huffman_code(Node* tree,map<char, vector<bool> >& result) {
    vector<bool> empty;
    huffman_travel(tree,empty,result);
}
void huffman_print(map<char, vector<bool> >& code) {
    cout<<"dict\n";
    for (map<char, vector<bool> >::iterator i=code.begin() ; i != code.end(); ++i ) {
        cout << i->first << " => ";
        for(vector<bool>::iterator j=i->second.begin(); j!=i->second.end(); ++j) {
            cout <<((*j)?'1':'0');
        }
        cout<<endl;
    }
}
void huffman_print(vector<bool>& code) {
    cout<<"list\n";
    for(unsigned i=0; i<code.size(); i++)
        cout<<code[i];
    cout<<endl;
}
void huffman_print(const char* code) {
    cout<<"string\n"<<code<<endl;
}
Node* huffman_encode(const string& text,map<char, vector<bool> >& dict,vector<bool>& result) {
    map<char,int> entries;
    huffman_count(text,entries);
    Node* tree = huffman_tree(entries);
    huffman_code(tree,dict);
    for(unsigned i=0; i<text.size(); i++) {
        const vector<bool>& append = dict[text[i]];
        copy(append.begin(),append.end(),back_insert_iterator< vector<bool> >(result));
    }
    return tree;
}
void huffman_decode(vector<bool>& code,Node* const tree,string& result) {
    Node* p = tree;
    for(unsigned i=0; i<code.size(); i++) {
        bool bit = code[i];
        p = (bit==false)?p->left:p->right;
        if(p->isLeaf()) {
            result.push_back(p->data);
            p = tree;
        }
    }
}
int main(int argc, char **argv) {
    const char* text = "William Shakespeare (baptised 26 April 1564; died 23 April 1616) was an English poet and playwright, widely regarded as the greatest writer in the English language and the world's pre-eminent dramatist. He is often called England's national poet and the \"Bard of Avon\". His surviving works, including some collaborations, consist of about 38 plays, 154 sonnets, two long narrative poems, and several other poems. His plays have been translated into every major living language and are performed more often than those of any other playwright.";
    map<char, vector<bool> > code;
    vector<bool> encode;
    Node* tree = huffman_encode(text,code,encode);
    huffman_print(code);
    huffman_print(encode);
    cout<<"bit length "<<strlen(text)*sizeof(char)*8<<"->"<<encode.size()<<endl;
    string decode;
    huffman_decode(encode,tree,decode);
    huffman_print(decode.c_str());
    delete tree;
    return 0;

2012年5月9日星期三

代码练习 - 容器

一组简单的代码练习,实现了基本的容器类型。
小小的活动手指,了解相关的运行调试编译环境。


* 代码练习一,简单的双向链表。

增删元素时,节点对前后节点的引用可千万别弄错呦。
包括自己的以及前后节点的都得修改,有遗漏就糟糕了。
依赖调试就不好了,得能够手写一遍就正确才可以的。

class DList(object):
    """
    Doubly linked list
    link: http://www.w3schools.com/jsref/jsref_obj_array.asp
    """
    class Node:
        def __init__(self,item,prev,next):
            self.item = item
            self.prev = prev
            self.next = next
    def __init__(self):
        self.sentinel = self.Node(None,None,None)
        self.sentinel.prev = self.sentinel
        self.sentinel.next = self.sentinel
        self.len = 0
    def length(self):
        for k,v in enumerate(self,1):
            pass
        else:
            assert k == self.len
            return k
    def __len__(self):
        return self.len
    def shift(self):
        if self.sentinel.next is self.sentinel:
            raise Exception("empty")
        self.len -= 1
        node = self.sentinel.next
        self._remove(node)
        return node.item
    def unshift(self,item):
        self.len += 1
        #self._insert(item,self.sentinel.next)
        self.sentinel.next = self.Node(item,self.sentinel,self.sentinel.next)
        self.sentinel.next.next.prev = self.sentinel.next
        return self.len
    def push(self,item):
        self.len += 1
        self._insert(item,self.sentinel)
        #self.sentinel.prev = self.Node(item,self.sentinel.prev,self.sentinel)
        #self.sentinel.prev.prev.next = self.sentinel.prev
        return self.len
    def pop(self):
        if self.sentinel.prev is self.sentinel:
            raise Exception("empty")
        self.len -= 1
        node = self.sentinel.prev
        self._remove(node)
        #self.sentinel.prev.prev.next = self.sentinel.prev.next
        #self.sentinel.prev = self.sentinel.prev.prev
        return node.item
    def indexOf(self,item):
        for k,v in enumerate(self):
            if v==item:
                return k
        else:
            return -1
    def lastIndexOf(self,item):
        for k,v in enumerate(reversed(self)):
            if v==item:
                return k
        else:
            return -1
    def slice(self,start,end):
        l = self.__class__()
        length = self.length()
        begin = start+length if start<0 else start
        stop = end+length if end<0 else end
        for k,v in enumerate(self):
            if k >= stop:
                break
            if k >= begin:
                l.push(v)
        return l
    @classmethod
    def _remove(cls,node):
        node.prev.next = node.next
        node.next.prev = node.prev
    @classmethod
    def _insert(cls,item,node):
        node.prev.next = cls.Node(item,node.prev,node)
        node.prev = node.prev.next
    def splice(self,start,howmany,*items):
        l = self.__class__()
        length = self.length()
        begin = start+length if start<0 else start
        stop = begin + howmany
        k, p = 0, self.sentinel.next
        while p is not self.sentinel:
            if k >= stop:
                break
            if k >= begin:
                l.push(p.item)
                self._remove(p)
                self.len -= 1
            p = p.next
            k += 1
        for i in items:
            self.len += 1
            self._insert(i,p)
        return l
    def _copy(self):
        l = self.__class__()
        for i in self:
            l.push(i)
        return l
    def concat(self,*lists):
        l = self._copy()
        for i in lists:
            for j in i:
                l.push(j)
        return l
    def join(self,sep):
        assert hasattr(sep,'join')
        return sep.join(map(str,self))
    def reverse(self):
        p = self.sentinel
        while True:
            q = p.next
            p.prev, p.next = p.next, p.prev
            p = q
            if p is self.sentinel:
                break
    def sort(self,cmp=lambda x,y:x-y):
        rest = self.sentinel.next
        while rest is not self.sentinel:
            m, p =rest, rest.next
            while p is not self.sentinel:
                if cmp(p.item,m.item)<0:
                    m = p
                p = p.next
            if rest is m:
                rest = m.next
            #self._remove(m);self._insert(m.item,rest)
            m.prev.next, m.next.prev = m.next, m.prev
            m.prev, m.next = rest.prev, rest
            rest.prev.next, rest.prev = m, m
    def __iter__(self):
        p = self.sentinel.next
        while p is not self.sentinel:
            yield p.item
            p = p.next
    def __reversed__(self):
        p = self.sentinel.prev
        while p is not self.sentinel:
            yield p.item
            p = p.prev
    def __str__(self):
        return ','.join(map(str,self))
    def __repr__(self):
        return '[%s]'%', '.join(map(str,self))
    def valueOf(self):
        return str(self)
    def toString(self):
        return repr(self)
    def map(self,proc):
        l = self.__class__()
        for i in self:
            l.push(proc(i))
        return l
    def forEach(self,proc):
        for i in self:
            proc(i)

这里通过私有的remove和insert方法来增删节点比直接增删节点要不容易错一些。






* 代码小练习二,仿JavaScript数组的实现。

接着前面的那一个再做个小练习,不过咱好菜呀,居然写得好费尽的感觉(沮丧。
下标和范围什么的总是搞错,单个函数一长的话就头晕晕的了,而且最后还是依赖调试了。
这样是不行的,真的不行的,在非类型安全的语言或者纸张上书写的话这样就写不出正确的代码是不行的。
嘛,或许再熟练一些就会好一些吧,对于常见的情况抽象为函数或固定的代码。恩,或许就是这样的。

<script>
var print = function(x) {
    document.write(x);
    document.writeln("<br>");
};
var p = print;
print("hello world");
function JArray() {
    var that = (this instanceof arguments.callee) ? this : new arguments.callee();
    if(arguments.length == 1 && this instanceof arguments.callee)
        that.length = arguments[0];
    else {
        that.length = 0;
        for(var i = 0; i < arguments.length; i++)
            that.push(arguments[i])

    }
    return that;
};
JArray.prototype.push = function(item) {
    this[this.length] = item;
    return ++this.length;
};
JArray.prototype.pop = function(item) {
    if(this.length == 0)
        return undefined;
    this.length--;
    var item = this[this.length];
    delete this[this.length];
    return item;
};
JArray.prototype.shift = function() {
    if(this.length == 0)
        return undefined;
    var item = this[0];
    for(var i = 0; i < this.length; i++)
        this[i] = this[i + 1];
    this.length--;
    delete this[this.length]
    return item;
};
JArray.prototype.unshift = function(item) {
    for(var i = this.length; i > 0; i--)
        this[i] = this[i - 1];
    this[0] = item;
    return ++this.length;
};
JArray.prototype.join = function(sep) {
    var sep = sep || ',';
    var string = this.length == 0 ? "" : this[0];
    for(var i = 1; i < this.length; i++) {
        string += sep + this[i];
    };
    return string;
};
JArray.prototype.toString = function(item) {
    return this.join();
};
JArray.prototype.slice = function(start, end) {
    var array = new this.constructor();
    var start = start < 0 ? start + this.length : start;
    var end = end || this.length;
    end = end < 0 ? end + this.length : end;
    for(var i = start; i < end && i < this.length; i++)
        array.push(this[i]);
    return array;
};
JArray.prototype.indexOf = function(item) {
    for(var i = 0; i < this.length; i++)
        if(this[i] == item)
            return i;
    return -1;
};
JArray.prototype.lastIndexOf = function(item) {
    for(var i = this.length - 1; i >= 0; i--)
        if(this[i] == item)
            return i;
    return -1;
};
JArray.prototype.concat = function() {
    var array = new this.constructor();
    for(var i = 0; i < this.length; i++)
        array.push(this[i]);
    for(var i = 0; i < arguments.length; i++)
        for(var j = 0; j < arguments[i].length; j++)
            array.push(arguments[i][j]);
    return array;
};
JArray.prototype.splice = function(index, howmany, item1) {
    var array = new this.constructor();
    var index = index < 0 ? index + this.length : index;
    var howmany = Math.min(this.length - index, howmany);
    var end = index + howmany;
    for(var i = index; i < end; i++)
        array.push(this[i]);
    var itemnum = arguments.length - 2;
    if(howmany > itemnum) {
        var eat = howmany - itemnum;
        for(var i = end; i < this.length; i++)
            this[i - eat] = this[i];
        for(var i = this.length - eat; i < this.length; i++)
            delete this[i];
        this.length -= eat;
    } else {
        var room = itemnum - howmany;
        for(var i = this.length - 1; i >= index; i--)
            this[i + room] = this[i];
        this.length += room;
    }
    for(var i = 0; i < itemnum; i++)
        this[index + i] = arguments[i + 2];
    return array;
};
JArray.prototype.reverse = function(start, howmany, item1) {
    for(var i = 0, j = this.length - 1; i < j; i++, j--) {
        var tmp = this[i];
        this[i] = this[j];
        this[j] = tmp
    };
};
JArray.prototype.sort = function(cmp) {
    var cmp = cmp || (function(x, y) {
        return x < y ? -1 : 1
    });
    var partition = function(array, x, y) {
        var pivot = array[x];
        var z = x;
        for(var i = x + 1; i < y; i++) {
            var tmp = array[i];
            if(cmp(tmp, pivot) < 0) {
                z++;
                array[i] = array[z];
                array[z] = tmp;
            }
        }
        array[x] = array[z];
        array[z] = pivot;
        return z;
    };
    var qsort = function(array, x, y) {
        if(x < y) {
            var z = partition(array, x, y);
            qsort(array, x, z);
            qsort(array, z + 1, y);
        }
    };
    qsort(this, 0, this.length);
    return this;
};
JArray.prototype.forEach = function(proc) {
    for(var i = 0; i < this.length; i++)
        proc(this[i])
};
JArray.prototype.every = function(proc) {
    for(var i = 0; i < this.length; i++)
        if(!proc(this[i]))
            return false;
    return true;
};
</script>

测试代码没有贴,不过最好其实还是给每个方法都对应几种情况写好自动测试为好吧这样。
好写细节和js本身的Array不一样,比如一些可选参数的情况,这里仅做拙劣的练习就不完善它了。
link: https://developer.mozilla.org/en/JavaScript/New_in_JavaScript/1.6#Array_extras



* 代码小练习三,简单的关联数组。

一个简单的关联数组,key限定为字符串类型。


#include <iostream>
#include <cstring>
using namespace std;
template<class T>
class Hash{
    class NotFound{};
    class Full{};
    struct Entry{
        const char* key;
        T value;
        };
    int _capacity;
    Entry* table;
    Entry* search(const char* key){
        int hash = strhash(key)%_capacity;
        for(int i=hash;i<hash+_capacity;i++){
            Entry* entry = &table[i%_capacity];
            if(entry->key==NULL || streq(entry->key,key))
                return entry;
        }
        return NULL;
    }
public:
    static int strhash(const char* str){
        int hash = 0;
        for(const char* p = str; *p; p++)
            hash = 31 * hash + *p;
        return hash;
        }
    static bool streq(const char* str1,const char* str2){
            return !strcmp(str1,str2);
        }
    Hash(int capacity)
        :_capacity(capacity),table(new Entry[capacity]){
            for(int i=0;i<capacity;i++)
                table[i].key = NULL;
        }
    ~Hash(){
        delete[] table;
        }
    void put(const char* key,T value){
        Entry* entry = this->search(key);
        if(entry){
            if(entry->key==NULL)
                entry->key = key;
            entry->value = value;
        }else
            throw Full();//"Full";
        }
    void del(const char* key){
        Entry* entry = this->search(key);
        if(entry && entry->key){
            entry->key=NULL;
        entry->value=NULL;
        }
    }
    T get(const char* key){
        Entry* entry = this->search(key);
        if(entry && entry->key)
            return entry->value;
        else
            throw NotFound();//"Not Found";
    }
    bool has(const char* key){
        Entry* entry = this->search(key);
        return entry && entry->key;
    }
    };
int main(int argc, char **argv)
{
    cout<<"hello world"<<endl;
    Hash<int> hash(100);
    cout<<hash.has("a")<<endl;
    hash.put("a",125);
    hash.put("b",128);
    cout<<hash.has("a")<<endl;
    cout<<hash.has("b")<<endl;
    cout<<hash.get("b")<<endl;
    hash.del("b");
    cout<<hash.has("b")<<endl;
    //cout<<hash.get("c")<<endl;
    return 0;
}



* 代码练习四,优先队列。

2012年4月25日星期三

Blogger后台截图留恋




由于Blogger比Gmal更早一些,所以帐号就不是Gmail的帐号了。
这个后台界面挺让人怀念的,毕竟用了很长时间,被GPlus了真讨厌呢。
有些是草稿,这个貌似之前说到过了,没想到挺占屏幕的。
没写过什么有价值的东西,都是自言自语胡思乱想的东西。
以前blogger和youtube都是可以很方便就能上的站点呢。
当初google还不强推什么googleplus这个整合地令人讨厌的东西呢。
当初墙还对ssl无奈呢,当初网页不是css布局,科技发展真是飞快呀。
当初知道窗口原来还可以用标题栏拖动位置,还真是吃了一惊呢。
当初还对着键盘敲着dir,然后进wps,然后打字打字打字只会打字呢。
回忆真是无趣,明明未来将获得更美好的东西,明明回忆是蕴含着——
希望,财富,信念,憧憬,活力的的过去时光,
何必总以未来的没有到来而让回忆就总染上了橘红色的悲伤气氛呢。

所谓“永恒”呢,或许它不是历史长河中注定的一尘不变的遗忘,而是在过去与将来的种种的不相干的事物之中呈现出来的一致面。一份由过去的经验与财富给当前留下的踏实感,一份由未来可以继续传递下去的憧憬和希望,这些都是由“永恒”所连接起的美好回忆与期盼的那一份璀璨纽带。

2012年4月8日星期日

一种Scheme语言中基于消息传递的面向对象机制 上 -- 语言机制

以下代码是单纯为了试一试在Scheme里面使用使用面向对象机制而写。
大致可行不过仍需完善,利用更多常见的使用情况还需要单独地特别考虑一下。

代码贴在这里格式乱掉了。。。

其实OO有很多中实现方式,这里主要是使用闭包了。
闭包有天然的作用域可以作为访问控制
----
http://pastebin.com/ifbtVEPS

;#lang r5rs
;(define error display)
;;;一种Scheme语言中基于消息传递的面向对象机制 上 -- 语言机制 - Closure
;;;面向对象的机制的实现方式很都多种,常见的比如基于向量和隐藏this指针的。
;;;这里是基于程序语言中的闭包特性来实现的,尝试了面向对象机制中的一些常见情况。
;;;参考1 OCaml http://caml.inria.fr/pub/docs/u3-ocaml/ocaml-objects.html
;;;参考2 SICP http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-17.html
;;;参考3 R5RS http://www.schemers.org/Documents/Standards/R5RS/HTML/
;;;查找对象方法的时间复杂度依赖所用实现的case语句的实现方式,若有优化则为O(1)。


;;;========================= 定义部分 ============================

;;;function send
;;;向对象发送消息
;;;type = closure * symbol * object list -> object
;;;example. (send pair 'set-car! 1) == (set-car! pair 1)
(define (send object method . arguments)
(cond ((object method)
=>(lambda (x) (apply x arguments)))
(else (error "method missing"))))
;;;function interface
;;;检验对象是否实现方法,注意方法名一致不能保证方法的语义正确
;;;type = symbol list -> (closure -> boolean)
;;;example. sequence? == (interface 'car 'cdr)
(define (interface . methods)
(lambda (object)
(let loop ((methods methods))
(cond ((null? methods) #t)
((not (object (car methods))) #f)
(else (loop (cdr methods)))))))
;;;functon coerce
;;;将对象绑定的若干方法表示为列表,用于调用泛型函数使用
;;;type = symbol list -> procedure list
;;;example. (coerce 'car 'cdr)
(define (coerce . methods)
(map (lambda (method)
(lambda (object . arguments) (apply send object method arguments)))
methods))
;;;function method
;;;从对象链中查找方法
;;;type = symbol * closure list -> (procedure | false)
;;;example. ((method 'car pair)) == (send pair 'car)
(define (method symbol . objects)
(let loop ((protos objects))
(cond ((null? protos) #f)
(((car protos) symbol) => (lambda (x) x))
(else (loop (cdr protos))))))
;;;macro object
;;;创建对象
;;;syntax = (object (prototypes ...) ((method (arguments ...) body ...) ...))
;;;example. (send (object () ((one () 1)))) 'one) == 1
(define-syntax object
(syntax-rules ()
((_ (prototypes ...) ((method-name (arguments ...) body ...) ...))
(lambda (m)
(case m
((method-name) (lambda (arguments ...) body ...)) ...
(else (apply method m (list prototypes ...))))))))
;;;macro thunk
;;;用于延迟绑定
;;;syntax = (thunk object)
;;;example. (thunk 0)
(define-syntax thunk
(syntax-rules () ((_ x) (lambda () x))))
;;;;macro object2
;;;上述“macro object”的 rich 版本
;;;syntax = (object2 self (prototype-news ...) ((slot value) ...) ((method (arguments ...) body ...) ...) init ...)
;;;example. <404>
(define-syntax object2;define
(syntax-rules ()
((_ self (prototype-news ...) ((slot value) ...) ((method (arguments ...) body ...) ...) init ...)
(letrec ((slot value) ...
(self
(lambda (m)
(case m
((method) (lambda (arguments ...) body ...)) ...
(else (let loop ((protos (list (prototype-news (lambda () self)) ...)))
(cond ((null? protos) #f)
(((car protos) m) => (lambda (x) x))
(else (loop (cdr protos))))))))))
init ... self))))
;;;object top-object2
;;;用于继承树的根节点
(define (top-object2 this)
(object2 self () () ((init () (this)))))
;;;function new
;;;用来构造一个对象
(define (new class . arguments)
(define self (apply send (class (thunk self)) 'init arguments))
self)
;;;========================= 示例部分 ============================

;;;* 示例一 Pair::mcons
;;;该示例用来演示如何创建一个对象
;;;定义A
(define (mcons0 x y)
(lambda (m)
(case m
((car) (lambda () x))
((cdr) (lambda () y))
((set-car) (lambda (z) (set! x z)))
((set-cdr) (lambda (z) (set! x z)))
((->pair) (lambda ()(cons x y)))
(else #f))))
;;;定义B
(define (mcons x y)
(object ()
((car () x)
(cdr () y)
(set-car (z) (set! x z))
(set-cdr (z) (set! y z))
(->pair () (cons x y)))))
;;;泛型
(define (mcar mpair)
((mpair 'car)))
(define (mcdr mpair)
((mpair 'cdr)))
(define (set-mcar mpair obj)
((mpair 'set-car) obj))
(define (set-mcdr mpair obj)
((mpair 'set-cdr) obj))
(define (mpair->pair mpair)
((mpair '->pair)))
;;;使用
(let example-1 ()
(define x (mcons 1 2))
(display (mcdr x))
(newline))

;;;* 示例二 sequence
;;;该示例用来演示接口和泛型的使用
(define sequence? (interface 'car 'cdr))
(define (mlist . objects)
(if (null? objects) '()
(mcons (car objects) (apply mlist (cdr objects)))))
(define (for-each0 p lst)
(cond ((null? lst) '())
(else (p (car lst)) (for-each p (cdr lst)))))
(define (mfor-each p lst)
(cond ((null? lst) '())
(else (p (mcar lst)) (mfor-each p (mcdr lst)))))
(define (generic-for-each prod lst car cdr)
(cond ((null? lst) '())
(else (prod (mcar lst)) (generic-for-each prod (mcdr lst) car cdr))))
(define (mrange start end step)
(letrec ((self (object()
((car () start)
(cdr () (let ((x (+ (send self 'car) step)))
(if (cond ((> step 0) (<= start x end))
((< step 0) (>= start x end))
(else (error "mrange")))
(mrange x end step) '())))))))
self))
(let example-2 ()
(define x (mlist 1 2 3 4 5))
(mfor-each display x)
(display (sequence? x))
;(mfor-each display (mrange 0 10 2))
;(generic-for-each display (mrange 0 10 2) mcar mcdr)
(apply generic-for-each display (mrange 0 -10 -3) (coerce 'car 'cdr))
(newline))

;;;* 示例三 slots
;;;该示例用来演示构造器和私有成员
(define (box x)
(letrec ((value '())
(self (object()
((ref () value)
(set! (x) (set! value x))
(add! (x) (send self 'set! (+ x (send self 'ref))))))))
(set! value x)
self))
(let example-3 ()
(define x (box 1))
(display (send x 'ref))
(send x 'set! 2)
(display (send x 'ref))
(send x 'add! 2)
(display (send x 'ref))
(newline))

;;;* 示例四 inheritance
;;;该示例用来演示扩充一个类
(define (mlist2 . objects)
(define super (apply mlist objects))
(define self
(object (super)
((for-each (prod) (apply generic-for-each prod self (coerce 'car 'cdr))))))
self)
(let example-4 ()
(define x (mlist2 1 2 3 4 5))
(mfor-each display x)
(send x 'for-each display)
(newline))

;;;* 实例五 virtual/override
;;;该示例用来演示模板方法模式
(let example-5 ()
(define (hello name)
(letrec((self
(object ()
((name () (name))
(say () (for-each display (list "hello " (send self 'name) "!\n")))))))
self))
(send (hello (lambda () "world0")) 'say)
(define (hello-world)
(define super (hello (lambda () (send self 'name))))
(define self (object (super) ((name () "world1"))))
self)
(send (hello-world) 'say)
(define (hello2 this)
(object ()
((name () "")
(say () (for-each display (list "hello " (send (this) 'name) "!\n"))))))
(define (hello-world2)
(define self
(object
((hello2 (thunk self)))
((name () "world2"))))
self)
(send (hello-world2) 'say)
(define (hello-world3)
(define self
(object2 self (top-object2 hello2) ()
((name () "world3"))))
self)
(send (hello-world3) 'say)
(define (hello-world4 this)
(define self
(object2 self (top-object2 hello2) ()
((name () "world4"))))
self)
(send (new hello-world4) 'say)
(newline))

;;;待续 一种Scheme语言中基于消息传递的面向对象机制 下 -- 使用模式


-----小改动,直接放在下面了----
不过话说这样更像Java而不是OCaml了。。。

(define-syntax object3;define
(syntax-rules (using method)
((_ ((method-name (arguments ...) body ...) ...))
(lambda (m)
(case m
((method-name) (lambda (arguments ...) body ...)) ...
(else #f))))
((_ super class using (traits ...) self ((slot value) ...) method ((method-name (arguments ...) body ...) ...))
(letrec ((super (class (lambda () self)))
(slot value) ...
(self
(lambda (m)
(case m
((method-name ) (lambda (arguments ...) body ...)) ...
(else (let loop ((protos (list super (traits (lambda () self)) ...)))
(cond ((null? protos) #f)
(((car protos) m) => (lambda (x) x))
(else (loop (cdr protos))))))))))
self))
((_ super class self ((slot value) ...) method ((method-name (arguments ...) body ...) ...))
(_ super class using () self ((slot value) ...) method ((method-name (arguments ...) body ...) ...)))
((_ super class self method ((method-name (arguments ...) body ...) ...))
(_ super class using () self () method ((method-name (arguments ...) body ...) ...)))
))
(define (root this)
(object3 ((init () (this)))))
(define (new class . arguments)
(define self (class (thunk self)))
(apply send (class (thunk self)) 'init arguments)
self)
(define (hello-world6 this)
(object3 super root using (hello2) self () method
((init () (send super 'init))
(name () "world6")
(say1 () (send (this) 'say)))))
(send (new hello-world6) 'say1)

-----
这样下去,干脆定义一个define-class算了。。。
用super表示父类(只有一个父类)this表示子类的对象,self表示当前类的对象。
这样好像把问题弄得复杂一下子就全部放在表面的问题。
由于用的是R5RS的“高级”“卫生”宏/语法,所以功能上有较多约束,不过有这些约束写起来还是更清爽了一些。
----
补充,感觉还是让object语法“小”一些比较好,多父类的判断可以移到语法宏的外面。
此外,多个原型也可以在子类中选择的调用,即不止一个spuer代表某个原型吧。
恩,其实OCaml里的对象是作为一种结构来实现的,要类型的话让闭包返回类型的联合也成吧。
嘛,这里也没多说清楚是怎么回事情,不过就是再补充一下了。