Hmm, well going from that presentation
http://bix.ucsd.edu/bioalgorithms/presentations/Ch08_GraphsDNAseq.pdf
on slide 37, I think that if you had, for example, a 4-mer like ATGC then this would necessitate two 3-mer NODEs, ATG and TGC, with an EDGE from ATG to TGC. TGC would then need a further EDGE, from TGC to GCx, for every 4-mer of the form TGCx. In general, if you have k-mers then the NODEs are (k-1)-mers, each k-mer is one EDGE, and the required string is spelled out by an Euler path (one that visits each EDGE exactly once: Euler's famous Königsberg bridges problem).
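To make the k-mer to EDGE idea concrete, here is a minimal sketch in C++. The function name kmerToEdge is just something I made up for illustration; it isn't from the presentation.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Split one k-mer into the two (k-1)-mer NODEs joined by its EDGE:
// first  = the first k-1 characters (where the EDGE starts),
// second = the last  k-1 characters (where the EDGE ends).
// Hypothetical helper name, for illustration only.
std::pair<std::string, std::string> kmerToEdge(const std::string& kmer) {
    assert(kmer.size() >= 2);  // need at least one character in each NODE
    return { kmer.substr(0, kmer.size() - 1), kmer.substr(1) };
}
```

So kmerToEdge("ATGC") gives the pair (ATG, TGC), and a 4-mer such as TGCA would give (TGC, GCA), continuing the chain from the same TGC NODE.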
Sadly, what I can't see from the algorithm, on slides 41-43, is what happens if you end up with a disconnected graph (other than simply appending the strings for each disconnected section).
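You can at least detect the disconnected case before attempting the Euler path. The sketch below is my own assumption about how to do that (it is not taken from the slides): ignore EDGE direction and count how many separate pieces the graph falls into. countComponents is a made-up name.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Count the weakly connected components of the (k-1)-mer graph,
// i.e. the number of separate pieces when EDGE direction is ignored.
// Hypothetical helper, not part of the presentation's algorithm.
int countComponents(const std::vector<std::string>& kmers) {
    // Undirected adjacency list keyed by NODE value.
    std::map<std::string, std::vector<std::string>> adj;
    for (const std::string& kmer : kmers) {
        assert(kmer.size() >= 2);
        std::string from = kmer.substr(0, kmer.size() - 1);
        std::string to = kmer.substr(1);
        adj[from].push_back(to);
        adj[to].push_back(from);
    }

    std::set<std::string> seen;
    int components = 0;
    for (const auto& entry : adj) {
        if (seen.count(entry.first)) continue;  // already reached from an earlier NODE
        ++components;
        // Depth-first search from this NODE, marking everything reachable.
        std::vector<std::string> todo{entry.first};
        seen.insert(entry.first);
        while (!todo.empty()) {
            std::string node = todo.back();
            todo.pop_back();
            for (const std::string& next : adj.at(node)) {
                if (seen.insert(next).second) todo.push_back(next);
            }
        }
    }
    return components;
}
```

If this returns more than 1 then no single Euler path can cover every EDGE, and you would have to treat each piece separately.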
I can start you off (perhaps), but I'm not sure that I could finish it. There are better graph theorists on this site ... ! Definitely need someone to advise on the best way to store NODEs and EDGEs in a network GRAPH.
Anyway,
- Read in all your k-mers (into a vector of strings).
- From these set up all the (k-1)-mer NODEs corresponding to the first and last k-1 chars of each k-mer. Put these in an array or vector of NODE objects. A NODE class is going to have to store (a) its value (here, a string) and possibly (b) the collection of EDGEs that start from it (or pointers to the NODEs that these go to).
- Remove any duplicates from the (k-1)-mers.
- From your given k-mers set up all the EDGEs needed to connect the (k-1)-mers: each k-mer gives one EDGE, from its first k-1 chars to its last k-1 chars.
- From these NODEs and EDGEs carry out the Euler path algorithm given in that presentation to get the shortest superstring (or superstrings if there are multiple disconnected subgraphs).
- Write out your shortest superstring.
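The steps above could be sketched roughly as follows. Treat this as one possible starting point under my own assumptions, not a finished solution: I've used a std::map from each NODE's string to the list of NODEs its EDGEs go to (rather than a NODE class), and I've used Hierholzer's algorithm for the Euler path, which may or may not match the version on the slides. It assumes the graph is connected and actually has an Euler path. The names Graph, buildGraph, eulerPath and spell are all made up.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// GRAPH: each (k-1)-mer NODE maps to the NODEs that its outgoing EDGEs lead to.
// Using a map means duplicate NODEs are merged automatically.
using Graph = std::map<std::string, std::vector<std::string>>;

Graph buildGraph(const std::vector<std::string>& kmers) {
    Graph g;
    for (const std::string& kmer : kmers) {
        assert(kmer.size() >= 2);
        std::string from = kmer.substr(0, kmer.size() - 1);  // first k-1 chars
        std::string to = kmer.substr(1);                     // last k-1 chars
        g[from].push_back(to);
        g[to];  // make sure the end NODE exists even if no EDGE starts there
    }
    return g;
}

// Hierholzer's algorithm. Assumes a connected graph that has an Euler path.
// The graph is taken by value because EDGEs are removed as they are used.
std::vector<std::string> eulerPath(Graph g) {
    if (g.empty()) return {};

    // balance = out-degree minus in-degree for each NODE.
    std::map<std::string, int> balance;
    for (const auto& entry : g) {
        balance[entry.first] += static_cast<int>(entry.second.size());
        for (const std::string& target : entry.second) balance[target] -= 1;
    }
    // Start at the NODE with one more EDGE out than in, if there is one;
    // otherwise (an Euler circuit) any NODE will do.
    std::string start = g.begin()->first;
    for (const auto& entry : balance) {
        if (entry.second == 1) { start = entry.first; break; }
    }

    std::vector<std::string> stack{start};
    std::vector<std::string> path;
    while (!stack.empty()) {
        std::string node = stack.back();
        std::vector<std::string>& out = g[node];
        if (out.empty()) {           // dead end: this NODE is finished
            path.push_back(node);
            stack.pop_back();
        } else {                     // follow (and use up) one unused EDGE
            stack.push_back(out.back());
            out.pop_back();
        }
    }
    std::reverse(path.begin(), path.end());  // NODEs were recorded back to front
    return path;
}

// Spell the string: the first NODE, then the last char of each later NODE.
std::string spell(const std::vector<std::string>& path) {
    if (path.empty()) return "";
    std::string result = path.front();
    for (std::size_t i = 1; i < path.size(); ++i) result += path[i].back();
    return result;
}
```

For example, spell(eulerPath(buildGraph({"GCAT", "ATGC", "TGCA"}))) walks ATG, TGC, GCA, CAT and gives ATGCAT. Reading the k-mers from a file and handling disconnected graphs are left out.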
Not sure if that is much of a help. This seems quite an advanced computational project. Google "Graph theory in C++" to see appropriate forms for your NODE class and how to store and use network GRAPHs (effectively, NODEs and EDGEs). EDIT: looks like @kemort has provided a good link below.
It is possible that you will get some lectures on coding network graphs during your course. Seems rather a lot to take on without any guidance.