mercredi 18 mars 2015

Porting C++ map with compare to Python


I'm porting some code from C++ to Python that relies upon the C++ ordering of keys in the map using custom compare functions. I has assumed I could simply use a Python dictionary and then use sorted() on the keys using an equivalent function as a key= to sorted(). However, no matter how I try, the keys get traversed in a different order in Python.


The C++ code looks something like this:



struct foo {
int a;
int b;
int c;
int d;
int e;

operator != (const struct foo& f) const {
return (this->a != f.a) || (this->b != g.b) || ... ;
}
};

struct foo_compare_one_way {
bool operator() (const struct foo& first, const struct foo&second) {
... logic that compares values in a certain order...
}

struct foo_compare_another_way {
bool operator() (const struct foo& first, const struct foo&second) {
... logic that compares values in a another order...
}

std::map<struct foo, std::vector<int>, struct foo_compare_one_way> m1;
std::map<struct foo, std::vector<int>, struct foo_compare_another_way> m2;


... and at various points in the code they use iterators and reverse iterators over m1 and m2.


In my Python code, I tried the following:



foo = collections.namedtuple('foo',['a','b','c','d','e'])

def foo_compare_one_way(x,y):
# implement same logic from C++, returning a bool value depending
# on which element of the tuple is > or < its other value

def foo_compare_other_way(x,y):
# implement same logic from C++ for other comparison function.

m1 = {}
m2 = {}

# populate m1 and m2

for k1 in sorted(m1, key=functools.cmp_to_key(foo_compare_one_way)):
# traverse keys in order

for k2 in reversed(sorted(m2, key=functools.cmp_to_key(foo_compare_another_way))):
# traverse keys in reverse order


The problem is, the order in the C++ program is different than the order that Python traverses. In many cases, its as if the Python comparison function isn't getting called at all. Am I missing something obvious in the way Python comparison functions should be defined?


Actual example follows:


The actual C++ code is some random implementation of gSpan I found on gitub available here. See ProjectionMap and struct dfs_code_t in src/gspan.h and src/graph.h .


For the Python code, actual snippets below:



dfs_code = collections.namedtuple('dfs_code',
['fromn','to','from_label','edge_label','to_label'])

def dfs_code_compare(a, b):
if a.from_label != b.from_label:
return a.from_label < b.from_label
else:
if a.edge_label != b.edge_label:
return a.edge_label < b.edge_label
else:
return a.to_label < b.to_label

def dfs_code_backward_compare(a, b):
if a.to != b.to:
return a.to < b.to
else:
return a.edge_label < b.edge_label

# code that fills in a dictionary called pm

for pm in sorted(projection_map, key=functools.cmp_to_key(dfs_code_compare)):
print pm


The python code above produces the following ordering:



dfs_code(fromn=0, to=1, from_label=1, edge_label=0, to_label=3)
dfs_code(fromn=0, to=1, from_label=2, edge_label=3, to_label=2)
dfs_code(fromn=0, to=1, from_label=1, edge_label=3, to_label=3)
dfs_code(fromn=0, to=1, from_label=1, edge_label=1, to_label=2)
dfs_code(fromn=0, to=1, from_label=2, edge_label=1, to_label=3)
....


Which is a) different from the C++ code, and b) not at all what I'd expect from the comparison function (I'd expect all the from_labels of 1 to be grouped together at the front, for example). Any ideas?




Aucun commentaire:

Enregistrer un commentaire