Add Louvain community detection algorithm#453
Add Louvain community detection algorithm#453Becheler wants to merge 5 commits intoboostorg:developfrom
Conversation
|
Not sure why this error is occurring only for OSX 11.7 C++14. I assume this is not specific to your code. |
jeremy-murphy
left a comment
There was a problem hiding this comment.
Just a few comments, more later.
| // Revision History: | ||
| // |
There was a problem hiding this comment.
I know embedded revision histories were a thing in the past, but I would prefer to keep all this metadata in git.
There was a problem hiding this comment.
I did not like it either, I'm glad you don't ahah :)
| std::map<VertexDescriptor, WeightType> internal_weights; | ||
| std::map<VertexDescriptor, std::set<VertexDescriptor>> vertex_mapping; |
There was a problem hiding this comment.
We generally try to avoid hard-coding the choice of data structure, especially std::map and std::set, so instead of templating VertexDescriptor and WeightType we should template the whole map type, so users can use boost::unordered_map or some other kind of property map of their own choice.
There was a problem hiding this comment.
Mhhh I see. I felt it was not in the BGL spirit to do so, and Joaquin also mentioned it, but I was not sure how to solve it without making the API heavy. Can we default to a concrete type to simplify user's experience ? Also, is it ok to use boost::unordered_map if it adds the constraint of key_type being hashable ? That was my idea behind using std::map
There was a problem hiding this comment.
Oh yes, we can still (and should) make the user experience nice with defaults. That's the great thing, we get both benefits, the cost is more work on the part of the library authors. :)
Again, I think astar is an example, but instead of using param = arg in the function definition, make an overload, like so:
auto user_friendly_foo(graph const &g) {
ConcreteA a;
ConcreteB b;
return generic_foo(g, a, b);
}
Sorry, typing on my phone, so please ignore random syntax errors.
There was a problem hiding this comment.
Umm, yeah, we probably shouldn't add new constraints into the default interface. Users will too easily assume that the constraint is mandatory.
So maybe use boost::flat_map as the default and users can always use a hash map if they want to.
| std::set<community_type> unique_communities; | ||
| std::map<community_type, vertex_descriptor> comm_to_vertex; | ||
| std::map<vertex_descriptor, std::set<vertex_descriptor>> vertex_to_originals; |
There was a problem hiding this comment.
These should almost certainly be input parameters taken by non-const reference so that the user a) decides their type and b) automatically gets their value at the end.
There was a problem hiding this comment.
So they gain access to the whole hierarchy. Ufff it's a lot of guts leaking out haha
Will do, and tell you in case of problemsm thanks again for your time !
There was a problem hiding this comment.
I could be wrong. But have a look at the astar API for some examples of prior art.
There was a problem hiding this comment.
And on second thought, this is not a priority, we can always do it later. Getting it correct and fast are higher priorities.
this was not supposed to be commited :)
this was not supposed to be commited :)
| //======================================================================= | ||
| // Copyright 2026 Becheler Code Labs for C++ Alliance | ||
| // Authors: Arnaud Becheler | ||
| // |
There was a problem hiding this comment.
You retain the copyright, so no need to mention the C++ Alliance. Also, is "Becheler Code Labs" a real legal entity? If this is not the case, then assigning (C) to your physical person would be best.
| auto unfold(const CommunityMap& final_partition) const | ||
| { | ||
| assert(!empty()); | ||
|
|
There was a problem hiding this comment.
Not sure what the BGL convention is, Boost libs generally use BOOST_ASSERT instead.
There was a problem hiding this comment.
I don't really mind, but yeah, better to follow whatever is common practice.
| current_nodes.insert(kv.first); | ||
|
|
||
| // From coarse to fine | ||
| for (int level = size() - 1; level >= 0; --level) { |
There was a problem hiding this comment.
Type mismatch, size() is a std::size_t. More idiomatic, but maybe not to your taste:
for(auto level = size(); level--; ) {|
|
||
| template <typename Graph, typename CommunityMap, typename WeightMap, typename QualityFunction = newman_and_girvan> | ||
| typename property_traits<WeightMap>::value_type | ||
| louvain_local_optimization( |
There was a problem hiding this comment.
Is this function supposed to be public?
| // L_c = internal edge weight for community c | ||
| // k_c = sum of degrees in community c | ||
| // m = total edge weight / 2 | ||
| struct newman_and_girvan |
There was a problem hiding this comment.
If users can provide their own quality function other than newman_and_girvan, you probably need to define a concept LouvainQuality of sorts. Take a look at how's done at
https://github.com/boostorg/graph/blob/develop/include/boost/graph/astar_search.hpp#L56
| bool has_rolled_back = false; | ||
|
|
||
| // Randomize vertex order once | ||
| std::mt19937 gen(seed); |
There was a problem hiding this comment.
This sould be made generic and passed by the user (with a default arg) as a URGB&&.
| // L_c = internal edge weight for community c | ||
| // k_c = sum of degrees in community c | ||
| // m = total edge weight / 2 | ||
| struct newman_and_girvan |
There was a problem hiding this comment.
Is this modularity thing a general concept or does it apply to Louvain only?
jeremy-murphy
left a comment
There was a problem hiding this comment.
Couple more requests, still going...
|
|
||
| // Track hierarchy of aggregation levels for unfolding partitions. | ||
| template <typename VertexDescriptor> | ||
| struct hierarchy_t |
There was a problem hiding this comment.
Only one data member and no real invariants to speak off suggests that you don't really need this type.
I think unfold could be a free function, maybe in a private namespace depending on how reusable it is, that takes levels and final_partition as parameters.
By the way, please only use the _t suffix for type aliases, not for names of classes.
| auto unfold(const CommunityMap& final_partition) const | ||
| { | ||
| assert(!empty()); | ||
|
|
There was a problem hiding this comment.
I don't really mind, but yeah, better to follow whatever is common practice.
Implement #447
But: