Ignoring time stamps

Jul 30, 2020 at 10:01am

I am trying to write code that ignores the timestamps in one file so that its contents can be compared with another, similar file in which the timestamps change every time. So ignoring the timestamps in one file is important.
File1
14/07/20 18:27:44:410 ALT 3
14/07/20 18:27:44:411 C: ALT 0x00 Ok
14/07/20 18:27:44:411 ACXT 0 PTE
File2
14/07/20 18:26:43:409 ALT 3
14/07/20 18:26:43:410 C: ALT 0x00 Ok
14/07/20 18:26:43:410 ACXT 0 PTE

#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
using namespace std;

int main()
{
    ifstream File1, File2;
    File1.open("file1.txt", ios::in);
    string inFile1;
    File2.open("file2.txt", ios::in);
    string inFile2;
    if (!(File1 && File2))
    {
        cerr << "Files are in error" << "\n";
        return EXIT_FAILURE;
    }

    bool bFoundString = false;

    // For each line of file 1, search file 2 for a line contained in it.
    while (getline(File1, inFile1))
    {
        bFoundString = false;

        while (getline(File2, inFile2))
        {
            if (inFile1.find(inFile2) != std::string::npos)
            {
                bFoundString = true;
                cout << "Comparison is okay" << endl;
                break;
            }
        }
    }

    if (!bFoundString)
    {
        // cout << "Comparison fails" << endl;
    }

    // Exit status 0 means the last line of file 1 was matched.
    return bFoundString ? EXIT_SUCCESS : EXIT_FAILURE;
}

Last edited on Jul 30, 2020 at 10:25am

Jul 30, 2020 at 10:14am

lastchance (6980)

Read each as

string date, time, value1, value2;
bool ok1 = true, ok2 = true;

while( ok1 && ok2 )
{
    // Skip date and time, then keep the rest of the line.
    ok1 = ( File1 >> date >> time && getline(  File1, value1 ) );
    ok2 = ( File2 >> date >> time && getline(  File2, value2 ) );
    if ( ok1 != ok2 || ( ok1 && value1 != value2 ) ) cout << "Values differ (or files of different sizes)\n";
}

Jul 30, 2020 at 10:37am

againtry (2313)

This might give you a small start to getting some output from the 2 files for subsequent line-parsing/ignoring/processing:

#include <cstdlib>   // EXIT_FAILURE
#include <iostream>
#include <iomanip>
#include <fstream>
#include <string>

int main()
{
    // ACCESS THE 2 FILES
    std::ifstream File1("file1.txt");
    if(!File1.is_open())
    {
        std::cout << "file 1 bombed ... at least\n";
        return EXIT_FAILURE;
    }
    
    std::ifstream File2("file2.txt");
    if(!File2.is_open())
    {
        std::cout << "file 2 bombed ... but file 2 must be open\n";
        return EXIT_FAILURE;
    }
    
    // START TO PROCESS THE FILES
    std::string line_from_File1, line_from_File2;
    while( std::getline(File1, line_from_File1) )
    {
        std::getline(File2, line_from_File2);
        std::cout
        << "FILE1: " << line_from_File1 << '\n'
        << "FILE2: " << line_from_File2 << "\n\n";
    }
    
    return 0;
}

FILE1: 14/07/20 18:27:44:410 ALT 3
FILE2: 14/07/20 18:26:43:409 ALT 3

FILE1: 14/07/20 18:27:44:411 C: ALT 0x00 Ok
FILE2: 14/07/20 18:26:43:410 C: AULT 0x00 Ok

FILE1: 14/07/20 18:27:44:411 ACXT 0 PTE
FILE2: 14/07/20 18:26:43:410 ACXT 0 PTE

Program ended with exit code: 0

Jul 30, 2020 at 11:27am

salem c (3715)

Unless this is a specific "you must use C++ for this homework" kind of exercise, this kind of job is best suited to your favourite text processing language (perl/python/sed/awk...).

Eg.

$ xsel
14/07/20 18:27:44:410 ALT 3
14/07/20 18:27:44:411 C: ALT 0x00 Ok
14/07/20 18:27:44:411 ACXT 0 PTE
$ xsel | cut -d' ' -f3-
ALT 3
C: ALT 0x00 Ok
ACXT 0 PTE

Jul 30, 2020 at 12:08pm

jonnin (11494)

C-strings trivialize this as well. One of the cases where string chokes a bit.
rough pseudo code:

char linebuf[1000] = {0};
char *line = &(linebuf[22]); //22 is the first valid letter after fixed width timestamp
file.getline(linebuf, sizeof(linebuf));
strstr(line, otherLine); //otherLine: do the same for the other file and put its pointer here
Note that we use line, not linebuf, in the compare, which basically substrings off the part you want without any extra processing.

Note that this version will trigger on whitespace differences. Note also that it only works if the timestamp is a fixed width on the front of the lines; other formats need something with a little less knot cutting and a little more finesse.

Last edited on Jul 30, 2020 at 12:12pm

Aug 1, 2020 at 3:48pm

CodeImpulse (10)

I am trying to use the sscanf function inside a for loop to skip the timestamps and read the rest of the string, but I am not getting the desired results. Any help will be appreciated.
14/07/20 18:27:44:411 C: ALT 0x00 Ok

Aug 1, 2020 at 4:46pm

salem c (3715)

Erm, you were reading into a std::string, and now you want to use the C sscanf function?

sscanf(buff,"%*s %*s"
This will skip the first two fields.
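
A minimal sketch of how that format string could be completed, assuming the date and time are always the first two whitespace-separated fields of a line. The helper names `payload_after_timestamp` and `same_payload` are hypothetical. The `%n` conversion stores how many characters have been consumed so far, so the remainder of the line can be used in place, whatever its length:

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Hypothetical helper: returns a pointer to the text that follows the first
// two whitespace-separated fields (date and time), or nullptr if the line
// has fewer than two fields. The payload is not copied.
const char* payload_after_timestamp(const char* line)
{
    assert(line != nullptr);
    int consumed = -1;
    // %*s skips a field without storing it; the space before %n skips the
    // whitespace after the second field; %n records the characters read.
    std::sscanf(line, "%*s %*s %n", &consumed);
    return consumed < 0 ? nullptr : line + consumed;
}

// Compares two log lines while ignoring their timestamps.
bool same_payload(const char* a, const char* b)
{
    const char* pa = payload_after_timestamp(a);
    const char* pb = payload_after_timestamp(b);
    return pa && pb && std::strcmp(pa, pb) == 0;
}
```

Because the fields are skipped by whitespace rather than by position, this sketch does not depend on the timestamp having a fixed width.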

Aug 1, 2020 at 5:13pm

CodeImpulse (10)

Thanks Salem. My problem is that I have to skip the timestamps of the strings of varying length e.g.
14/07/20 18:27:02:533 ... C: SPLT 0x00 MTE_DL_RBSFN_STAT 0X00000001 0X00000000 0X00000000 0X00000000 0X00000000 0X00000000 0X00000000 0X00000000 0X00000000 0X00000000 0X00000000 0X00000000 0X00000000 0X00000000
14/07/20 18:27:02:533 ... SLOF
14/07/20 18:27:02:537 ... C: SLOF 0x00
14/07/20 18:27:02:541 forg l1 SetPortMapping 0
14/07/20 18:27:02:550 C: FORG 0x00 Ok l1 SetPortMapping 0
So I need to skip the timestamps but read the rest of the string, and this will happen in a loop. Because the rest of the line varies in length, I cannot use a fixed set of %s or %*s conversions. I need an algorithm that skips the timestamps and reads strings of different lengths.
I tried to achieve this using for loops to ignore the timestamps and read the rest, but it did not work the way I wanted.

Aug 1, 2020 at 6:27pm

jonnin (11494)

I gave you a very simple answer to that.
Read into a char array, take a constant pointer offset into the array past the timestamp, and compare those. It's simple and it works, so long as the timestamp is the first thing and is a fixed width, which, so far, is what you show.

Here is the core of what you need; you can wrap file I/O and pretties around it:

This operates off the use-what-you-have approach, rather than using string for the sake of string. C-style strings trivialize this problem.


#include <cstring>   // strcmp
#include <iostream>
using namespace std;

int main()
{
/*
File1
14/07/20 18:27:44:410 ALT 3
14/07/20 18:27:44:411 C: ALT 0x00 Ok
14/07/20 18:27:44:411 ACXT 0 PTE
File2
14/07/20 18:26:43:409 ALT 3
14/07/20 18:26:43:410 C: ALT 0x00 Ok
14/07/20 18:26:43:410 ACXT 0 PTE
*/

char buff1[1000] = "14/07/20 19:27:44:411 C: ALT 0x00 Ok"; //I changed timestamps so no 2 same
char buff2[1000] = "14/07/20 11:27:44:411 C: ALT 0x00 Ok";
char buff3[1000] = "14/07/20 13:27:44:411 C: ALT 0x00 not Ok";

char * line1 = &(buff1[22]); //22 skips the space past the timestamp and picks up from C forward.  you can adjust if needed.  
char * line2 = &(buff2[22]);
char * line3 = &(buff3[22]);

//you only need 2 buffers and 2 lines.  I have 3 to demo both ok and not ok result 
//without typing all the file stuff.

bool notok;
notok = strcmp(line1, line2);
if(notok) cout << "Not OK"; else cout << "OK";

cout << endl;

notok = strcmp(line1, line3);
if(notok) cout << "Not OK"; else cout << "OK";

}

To do it with string, maybe someone has a better way, but I think you have to copy the data extra times (I can't find a way to avoid copying the data, e.g. with a substring or left/right type function). But I don't do a lot of text processing, and am not a guru at it.
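
For what it's worth, `std::string::compare` has an overload that compares substrings of two strings in place, so no copy is needed. A minimal sketch, assuming the same fixed-width timestamp as above (22 characters including the trailing space); the names `kTimestampWidth` and `same_payload` are hypothetical:

```cpp
#include <cstddef>
#include <string>

// Assumption: the timestamp plus its trailing space occupies the first 22
// characters of every line, as in "14/07/20 18:27:44:411 ".
constexpr std::size_t kTimestampWidth = 22;

// Compares two lines while ignoring the timestamps; no characters are copied.
bool same_payload(const std::string& a, const std::string& b)
{
    // compare() throws std::out_of_range if the start position is past the
    // end of the string, so lines that are too short are rejected first.
    if (a.size() < kTimestampWidth || b.size() < kTimestampWidth) return false;
    return a.compare(kTimestampWidth, std::string::npos,
                     b, kTimestampWidth, std::string::npos) == 0;
}
```

With C++17, a `std::string_view` over each line would be another way to get a non-owning "substring".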

Last edited on Aug 1, 2020 at 6:43pm

Aug 1, 2020 at 6:42pm

CodeImpulse (10)

Thank you jonnin(7442). I need to take into account different time formats and different string lengths, and this is the maximum width of all the time formats. I was using something like this to display the timestamp.
#include <stdio.h>

int main()
{
char str []="14/07/20 18:26:44:409 MULT 3";
char buf[20];

int i;

for(i = 0;str[i]>= '0' && str[i] <= '21';i++)
{
sscanf(str,"%s",buf);

printf("%s",buf);

}

return 0;
}

But the result is weird.

14/07/2014/07/20
Process returned 0 (0x0) execution time : 0.108 s
Press any key to continue.

Aug 1, 2020 at 7:59pm

againtry (2313)

@CodeImpulse

Until you have a clear picture of what you are trying to do then it’s virtually impossible to help you.

So far you have conveyed the following:

1. You are sorting on a multi-format string you call a timestamp
2. You have 2 files
3. The timestamp is to be ignored in one of them
4. Which one is anybody’s guess
5. Sometimes it doesn’t work the way you want it.
6. Other times the results are weird
7. And so it goes towards the inevitable large hole to doom

I wonder whether you are at all serious about solving your problem. This has all the earmarks of a scrap/redo after you make it clear to yourself, in particular, wtf you are trying to do.

Unless this is some sort of legacy code I among the majority of others suggest you write in C++ instead of C. C here is doing you no good as my starter with just a couple of lines shows. Instead of solving the problem, so far with C, you’re just feeding it, sad to say :(

Aug 1, 2020 at 8:45pm

CodeImpulse (10)

@againtry
I have already solved this problem by removing the timestamps using C++. Now I want to solve this problem by skipping the timestamps rather than removing them. These files are just sample files. Actual files have hundreds of lines and it is not possible to know the number of lines because they are generated automatically by system.

while(getline(File1,inFile1)) // This will read my first file in which timestamps change
while(getline(File2,inFile2)) // This will read my second file with constant time stamps.
Here I want to skip the timestamps of second file for comparison with first file.
if(inFile1.find(inFile2) !=std::string::npos) // This will do the comparison by comparing the contents of the files.

The width of the timestamps also changes. I hope everything is clear now. Maybe sscanf solves the problem; I am not sure.

Aug 1, 2020 at 9:05pm

jonnin (11494)

Alternating between reading one string and reading one line will work if the timestamp is still at the front of the line but of variable length. You need to read up on how to mix getline with cin >> or similar C statements; you can bork the stream if you are not mindful.
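
One way to sidestep the mixing problem, sketched here as an assumption rather than as the only approach: read each whole line with getline first, then split that single line with a string stream. The helper name `extract_payload` is hypothetical.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: discards the first two whitespace-separated fields
// (date and time) of one log line and stores the rest in payload.
// Returns false if the line does not contain both fields.
bool extract_payload(const std::string& line, std::string& payload)
{
    payload.clear();
    std::istringstream fields(line);
    std::string date, time;
    if (!(fields >> date >> time)) return false;
    // Skip the whitespace left after the time field, then take the rest.
    // The stream holds a single line, so std::ws cannot run on into the
    // following line, which it could when reading the file directly.
    fields >> std::ws;
    std::getline(fields, payload);
    return true;
}
```

A line that holds only a timestamp yields an empty payload instead of swallowing the start of the next line.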

Aug 1, 2020 at 9:42pm

lastchance (6980)

#include <iostream>
#include <string>
#include <sstream>
#include <fstream>
using namespace std;

struct Item
{
   string date;
   string time;
   string value;
};

istream & operator >> ( istream &in, Item &item )
{ 
   in >> item.date >> item.time;
   getline( in >> ws, item.value );
   return in;
}


int main()
{
// ifstream in1( "file1.txt" );
// ifstream in2( "file2.txt" );
   istringstream in1( "14/07/20 18:27:44:410 ALT 3\n"
                      "14/07/20 18:27:44:411 C: ALT 0x00 Ok\n"
                      "14/07/20 18:27:44:411 ACXT 0 PTE\n" );
   istringstream in2( "14/07/20 18:26:43:409 ALT 3\n"
                      "14/07/20 18:26:43:410 C: ALT 0x00 Ok\n"
                      "14/07/20 18:26:43:410 ABCD E FGH\n" );

   for ( Item item1, item2; in1 >> item1 && in2 >> item2; )
   {
       cout << "Read from file 1: " << item1.date << " " << item1.time << " " << item1.value << '\n';
       cout << "Read from file 2: " << item2.date << " " << item2.time << " " << item2.value << '\n';
       cout << "Values " << ( item1.value == item2.value ? "are" : "are not" ) << " the same.\n\n";
   }
}

Read from file 1: 14/07/20 18:27:44:410 ALT 3
Read from file 2: 14/07/20 18:26:43:409 ALT 3
Values are the same.

Read from file 1: 14/07/20 18:27:44:411 C: ALT 0x00 Ok
Read from file 2: 14/07/20 18:26:43:410 C: ALT 0x00 Ok
Values are the same.

Read from file 1: 14/07/20 18:27:44:411 ACXT 0 PTE
Read from file 2: 14/07/20 18:26:43:410 ABCD E FGH
Values are not the same.

Last edited on Aug 2, 2020 at 5:31am

Aug 1, 2020 at 10:37pm

againtry (2313)

@CodeImpulse

Unfortunately for you, you are making a fundamental error in problem-solving this, i.e. if you can’t describe it then you won’t be able to solve it, harsh as that sounds.

At the moment you are focussing on what code to use but not what it is you expect it to do. Have you any sort of a plan or is it a series of shots in the dark and you’ll know you’ve got there when you see it? Good luck with that approach.

The number of lines in each is basically irrelevant. While loops through the files get over that. To exaggerate, like the movie that’s problem 275 and we’re stuck here at problem 2.

So,

1. Each line of each file has (maybe, so far only you know for sure) 3 separate pieces of data - a date and time both of which can be read as <strings>, followed by another string of stuff which includes spaces

2. The string of stuff, let’s call it Stuff, is everything after the date and time strings to the end of the line.

So now, wtf is supposed to happen next?

3. Read the Stuff in file 1
4. Find out whether that Stuff is in file 2
5. Do something, who knows what when it’s found or not ...

Only you know ... if that ... the rest is mechanics when you do :)

Aug 2, 2020 at 2:52am

againtry (2313)

@CodeImpulse Here's the getline() version you alluded to earlier but using C++ and reading directly from your files.

Any or all of the extracted data from either of the two files or both can be processed via the container once you've decided what that process is.

The value of a <map> ('dictionary container') is rapid searching, but there are plenty of others to choose from: <vector>s, <set>s, even the dreaded C-style arrays.

#include <cstdlib>   // EXIT_FAILURE
#include <iostream>
#include <iomanip>
#include <fstream>
#include <map>
#include <string>

int main()
{
    // ACCESS THE 2 FILES
    std::ifstream File1("file1.txt");
    if(!File1.is_open())
    {
        std::cout << "file 1 bombed ... at least\n";
        return EXIT_FAILURE;
    }
    
    std::ifstream File2("file2.txt");
    if(!File2.is_open())
    {
        std::cout << "file 2 bombed ... but file 2 must be open\n";
        return EXIT_FAILURE;
    }
    
    // START TO PROCESS THE FILES
    std::cout << "FROM FILE 1:\n";
    std::string stuff_1_1, stuff_1_2, stuff_1_3;
    
    while
        (
         File1 >> stuff_1_1 >> stuff_1_2 and
         std::getline(File1, stuff_1_3)
         )
    {
        
        std::cout
        << "Date: " << stuff_1_1 << " Time stamp: "
        << stuff_1_2 << " Stuff: "
        << stuff_1_3 << '\n';
    }
    std::cout << '\n';
    
    std::cout << "FROM FILE 2:\n";
    std::string dummy, stuff_2_3;
    while( File2 >> dummy >> dummy and std::getline(File2, stuff_2_3) )
    {
        std::cout << "Stuff: " << stuff_2_3 << '\n';
    }
    
    return 0;
}

FROM FILE 1:
Date: 14/07/20 Time stamp: 18:27:44:410 Stuff: ALT 3
Date: 14/07/20 Time stamp: 18:27:44:411 Stuff: C: ALT 0x00 Ok
Date: 14/07/20 Time stamp: 18:27:44:411 Stuff: ACXT 0 PTE

FROM FILE 2:
Stuff: ALT 3
Stuff: C: AULT 0x00 Ok
Stuff: ACXT 0 PTE
Program ended with exit code: 0

Last edited on Aug 2, 2020 at 3:29am

Aug 2, 2020 at 3:08am

againtry (2313)

PS Depending on what you decide the process is, the search in File1 (say) for the Stuff in File2, if that is what you are still doing, will indicate what the map key might be.

Note also that the map is built as File1 is read, and in a separate while loop the File2 Stuff is processed against what is stored in the <map>.
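
A rough sketch of that two-loop idea, under the assumption that the process is "report every Stuff line of File2 that never occurs in File1". It swaps in a std::set for the <map>, since only membership is needed here; the helper names `read_stuff` and `missing_from_first` are hypothetical.

```cpp
#include <istream>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads "date time stuff" lines from a stream and
// returns only the stuff, one entry per line, without leading spaces.
std::vector<std::string> read_stuff(std::istream& in)
{
    std::vector<std::string> result;
    std::string line;
    while (std::getline(in, line))
    {
        std::istringstream fields(line);
        std::string date, time, stuff;
        if (!(fields >> date >> time)) continue;   // skip malformed lines
        std::getline(fields >> std::ws, stuff);
        result.push_back(stuff);
    }
    return result;
}

// Returns the File2 stuff that never occurs in File1.
std::vector<std::string> missing_from_first(std::istream& file1, std::istream& file2)
{
    // First loop: store everything seen in File1.
    const std::vector<std::string> first = read_stuff(file1);
    const std::set<std::string> seen(first.begin(), first.end());

    // Second loop: look each File2 entry up in the container.
    std::vector<std::string> missing;
    for (const std::string& stuff : read_stuff(file2))
        if (seen.count(stuff) == 0) missing.push_back(stuff);
    return missing;
}
```

Both functions take std::istream&, so they accept either std::ifstream objects opened on the two files or std::istringstream objects for a quick test.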

Aug 17, 2020 at 10:55am

CodeImpulse (10)

I solved this problem. Thanks for prompt replies. Will get back to you for other issues if need be.

Aug 17, 2020 at 12:40pm

dhayden (5799)

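Learn and love the command line utilities to pre-process your strings. A slight variation on SalemC's first suggestion to remove the timestamps, assuming they are a fixed width (cut counts characters from 1, so the text after the timestamp starts at character 23):
cut -c23-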

Topic archived. No new replies allowed.