Jun 23, 2015 at 9:07pm UTC
I need to parse multiple lines from a file so that I only get a substring from each line. I have my sample data and my rough algorithm below.
Sample Data:
Input: 05/29/2014 08:03 PM 6,385,700 Intro-to-Political-Science.pdf
Expected Output: Intro to Political Science
Input: 14,655,232 Google.com.Regular.Expression.2001.pdf
Expected Output: Regular Expression
Input: Bing.com_Cplusplus Programming, Dale and Weems.pdf
Expected Output: Cplusplus Programming
My algorithm so far
Declare filename as string
Declare lineinput as string
Prompt user to enter filename
Input filename
Get lineinput from filename
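The steps above could be sketched in C++ roughly as follows. This is only a minimal sketch: the helper names read_lines and read_lines_from_file are made up for illustration, and returning an empty vector when the file cannot be opened is an assumption, not something the algorithm specifies.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read every line from an input stream into a vector (hypothetical helper).
std::vector<std::string> read_lines(std::istream& in) {
    std::vector<std::string> lines;
    std::string lineinput;
    while (std::getline(in, lineinput)) {
        lines.push_back(lineinput);
    }
    return lines;
}

// Open `filename` and return its lines.
// Assumption: an unreadable file simply yields an empty vector.
std::vector<std::string> read_lines_from_file(const std::string& filename) {
    std::ifstream file(filename);
    if (!file) {
        return {};
    }
    return read_lines(file);
}
```

Prompting the user would then be a matter of reading filename with std::getline(std::cin, filename) and passing it to read_lines_from_file.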
After this, I am not sure what regular expression would give me the desired output. I read about regex_match and match_results, but I am not sure how to use them here.
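For reference, one way regex_match and match_results (here std::smatch, which is match_results for std::string) might be applied to the three sample lines is sketched below. The rules are guesses inferred only from those samples: an optional date, time, and size come first, then an optional "something.com" prefix, then the title, then an optional four-digit year and an optional ", authors" tail before ".pdf". The function name extract_title is hypothetical, and the pattern is not claimed to generalize beyond the samples.

```cpp
#include <algorithm>
#include <regex>
#include <string>

// Returns the guessed title, or an empty string if the line does not
// fit the assumed layout. All rules here are inferred from three samples.
std::string extract_title(const std::string& line) {
    static const std::regex pattern(
        R"(^(?:[0-9]{2}/[0-9]{2}/[0-9]{4}\s+[0-9]{2}:[0-9]{2}\s+[AP]M\s+)?)"  // optional date and time
        R"((?:[0-9,]+\s+)?)"                // optional file size, e.g. 6,385,700
        R"((?:[A-Za-z0-9-]+\.com[._ ])?)"   // optional site prefix, e.g. Bing.com_
        R"((.+?))"                          // group 1: the title, matched lazily
        R"((?:[._ ][0-9]{4})?)"             // optional year, e.g. .2001
        R"((?:,.*)?)"                       // optional ", authors" tail
        R"(\.pdf$)");                       // required extension

    std::smatch m;  // std::match_results<std::string::const_iterator>
    if (!std::regex_match(line, m, pattern)) {
        return "";
    }

    // Assumption: '-' and '.' inside the title stand in for spaces.
    std::string title = m[1].str();
    std::replace_if(title.begin(), title.end(),
                    [](char c) { return c == '-' || c == '.'; }, ' ');
    return title;
}
```

Each lineinput read from the file would be passed to extract_title. A line that does not end in ".pdf" produces an empty string, which the caller has to handle.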
Last edited on Jun 23, 2015 at 9:07pm UTC
Jun 24, 2015 at 8:04am UTC
For simple input like that, regex seems like overkill.
What are the criteria for the substring?
Jun 24, 2015 at 1:51pm UTC
My first thought was that this is too complicated for regex, and rather problematic in general, based on the limited information provided so far.
OK... here you somehow spot that Bing.com is a web site and that Dale and Weems are the authors...
Input: Bing.com_Cplusplus Programming, Dale and Weems.pdf
Expected Output: Cplusplus Programming
but the other examples are just book titles, without authors, so how about:
Object-oriented Analysis, Design and Implementation: An Integrated Approach Paperback by Brahma Dathan, Sarnath Ramnath (dropping the subtitle...)
Input: Bing.com_Object-oriented Analysis, Design and Implementation.pdf
Expected Output: Object-oriented Analysis, Design and Implementation
rather than just Object-oriented Analysis
And how about:
Justinguitar.com Beginner's Songbook by Justin Sandercoe
Input: 05/29/2014 08:03 PM 6,385,700 Justinguitar.com Beginner's Songbook
Expected Output: Justinguitar.com Beginner's Songbook
how do you know to keep Justinguitar.com here but throw away Bing.com and Google.com elsewhere?
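To make the ambiguity concrete, here is a small sketch of two mechanical rules that fit the original samples. Both function names are hypothetical and the rules are assumptions, not stated criteria. Applied to the two examples above, each rule gives a result that differs from the expected output.

```cpp
#include <regex>
#include <string>

// Hypothetical rule: drop a leading "<site>.com" plus one separator character.
std::string strip_site_prefix(const std::string& s) {
    static const std::regex site(R"(^[A-Za-z0-9-]+\.com[._ ])");
    return std::regex_replace(s, site, "");
}

// Hypothetical rule: treat everything from the first comma onward as authors.
// If there is no comma, find() returns npos and the whole string is kept.
std::string strip_after_comma(const std::string& s) {
    return s.substr(0, s.find(','));
}
```

strip_site_prefix turns "Justinguitar.com Beginner's Songbook" into "Beginner's Songbook", and strip_after_comma turns "Object-oriented Analysis, Design and Implementation" into "Object-oriented Analysis". Neither matches the expected output, so the pattern alone cannot decide these cases without more criteria.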
Andy
Last edited on Jun 24, 2015 at 1:54pm UTC