Programing Problem: Partition a List by Equivalence

By Xah Lee. Date: . Last updated: .

Here's another interesting programing problem , again from part of a larger program in the previous series.

Problem Spec

Write a function merge.

merge(listOfPairs) takes a list of pairs, each pair indicates the sameness of the two indexes. Returns a partitioned list of same indexes.

For example, if the input is

merge( [ [1,2], [2,4], [5,6] ] )

that means 1 and 2 are the same. 2 and 4 are the same. Therefore 1==2==4. The result returned is

[[4,2,1],[6,5]]

(ordering of the returned list and sublists are not specified.)


Answer below.

Perl Solution

# -*- coding: utf-8 -*-
# perl

sub merge($) {
my @pairings = @{$_[0]}; # for example: ([a,b], [c,d], etc)

my @interm; # array of hashs. For the hash, Keys are numbers, values are dummy 'x'.

# chop the first value of @pairings into @interm
$interm[0]={$pairings[0][0]=>'x'}; ${interm[0]}{$pairings[0][1]}='x'; shift @pairings;

 N1: for my $aPair (@pairings) {
     for my $aGroup (@interm) {
         if (exists ${$aGroup}{$aPair->[0]}) {${$aGroup}{$aPair->[1]}='x'; next N1}
         if (exists ${$aGroup}{$aPair->[1]}) {${$aGroup}{$aPair->[0]}='x'; next N1}
     }
     push @interm, {$aPair->[0]=>'x'}; ${interm[-1]}{$aPair->[1]}='x';
 }

my @fin = shift @interm;

N2: for my $group (@interm) {
    for my $newcoup (@fin) {
        foreach my $k (keys %$group) {
            if (exists ${$newcoup}{$k}) {map { ${$newcoup}{$_}='x'} (keys %$group); next N2;}
  }
    }
    push @fin, $group;
}
return map {[keys (%$_)]} @fin;
}

Python Solution

Here's a direct translation of the Perl code above into Python:

# -*- coding: utf-8 -*-
# python 2

def merge(pairings): # pairings is a list of couples. For example, [(9,2),(7,6), <var class="d">…</var>]

    # interm is a list of groups. Each group is a list that hold
    # equivalent numbers. interm stands for interim result. Each group
    # is a dictionary. Keys are numbers, values are all dummy
    # 'x'. Dictionary is used for ease of dealing with duplicates or
    # checking existence.
    interm=[];

    # move first pair of pairings into interm as the first group
    interm.append({pairings[0][0]:'x', pairings[0][1]:'x'}) ; del pairings[0]

    # go thru pairings. For each pair, check if it is in any group in
    # interm. If any part of pair is in a group, then add the other
    # part into that group. If no part of the pair is in any group,
    # then add this pair into interm as a new group.
    for aPair in pairings:
        for aGroup in interm:
            if (aGroup.has_key(aPair[0])): aGroup[aPair[1]]='x'; break
            if (aGroup.has_key(aPair[1])): aGroup[aPair[0]]='x'; break
        else: interm.append( {aPair[0]:'x'} ); interm[-1][aPair[1]]='x'

    # now make another pass of the groups in interm, because some pair
    # that may connect two groups (i.e. with one element in one group,
    # and second element in another group), yet the pair is simply
    # consumed by one group.
    # This pass will check if there are any element in any other
    # group, if so, such two groups will be unioned. In this pass, we
    # move things from interm into fin. fin==final.


    fin=[]; fin.append(interm.pop(0))
    goto=False
    for group in interm:
        for newcoup in fin:
            for k in group.keys():
                if newcoup.has_key(k):
                    newcoup.update(group);
                    goto=True
                    break
            if (goto):
                goto=False
                break
        else:
            fin.append(group)

    # now turn the dictionaries into lists for return value
    result=[];
    for group in fin: result.append(group.keys())
    return result

I wrote this (Perl) program in 2003-09, and now basically forgot all about the internals. The original Perl code does not have inline comments, nor public consumable documentation as this is my own project. In the process of translation and the publication and explanation on this page, i eventually have to re-acquaint the algorithm i used as i go thru the lines. I was thinking of a quick brainless translation word-for-word, but that turned out not possible as i run into problems.

(While i'm learning Python, i run into frustrations with the Python Documentation. (although it has far more quality than Perl documentations). The frustrations with documentations will be appended to this page later: Python Documentation Problems.)

The Problem of Translating Goto Statement

The translation problem i run into is this. In Perl, there's a GOTO construct where in a loop one can say "next XYZ" to jump to a arbitrary outer loop labeled XYZ. Python has “break” but does not provide a GOTO jump as in Perl. In the process, i have to actually figure out (for the first time for me) how loops with GOTO jumps can be translated to alternative structure. This turned out not too hard. For a GOTO jump to a far outer group, one can use multiple if/break at the end of each loop, possibly in addiction adding a “else” clause to the different levels of the loops. (Python language can have a “else” clause for “for” loops. It is executed when the loop completes. (as opposed to when a break inside jumped out))

Here is a loop with GOTO, translated into Python without:

N1: for my $aPair (@pairings) {
     for my $aGroup (@interm) {
         if (exists ${$aGroup}{$aPair-&gt;[0]}) {${$aGroup}{$aPair-&gt;[1]}='x'; next N1}
         if (exists ${$aGroup}{$aPair-&gt;[1]}) {${$aGroup}{$aPair-&gt;[0]}='x'; next N1}
     }
     push @interm, {$aPair-&gt;[0]=&gt;'x'}; ${interm[-1]}{$aPair-&gt;[1]}='x';
 }
for aPair in pairings:
        for aGroup in interm:
            if (aGroup.has_key(aPair[0])): aGroup[aPair[1]]='x'; break
            if (aGroup.has_key(aPair[1])): aGroup[aPair[0]]='x'; break
        else: interm.append( {aPair[0]:'x'} ); interm[-1][aPair[1]]='x'

Below is another loop with GOTO that jumps 2 levels out. Notice the use of a goto marker.

N2: for my $group (@interm) {
    for my $newcoup (@fin) {
        foreach my $k (keys %$group) {
            if (exists ${$newcoup}{$k}) {map { ${$newcoup}{$_}='x'} (keys %$group); next N2;}
  }
    }
    push @fin, $group;
}
goto=False
    for group in interm:
        for newcoup in fin:
            for k in group.keys():
                if newcoup.has_key(k):
                    newcoup.update(group);
                    goto=True
                    break
            if (goto):
                goto=False
                break
        else:
            fin.append(group)

The Python version above has not been tested much, but is a direct translation of the Perl code. The Perl version is fairly reliable, because i've used it over the year as part of a larger program. However, no any formal proof or exhaustive machine testing has been done. Later i might write some program to auto-test them … but that'd be another day. The Python version can also use some clean up, or even rewrite as a program in the Python language proper. Addenda or Errata will be added on this page.