Best file and record structure for a file

I am 64 and have started learning C++. In the late 1970s I wrote several programs in compiled BASIC that were used commercially. I created an input routine that read one character at a time and checked its ASCII code so I could accept or reject the character. I converted each field into a string and wrote fixed-length fields into a record of fixed length. Different files had different keys (some had multiple keys), sorted in ascending order. Binary search found keys in fewer than 10 tries.

The programs I wrote were Pharmacy management, Job Costing with 6 variances, Radio ad scheduling, Time Share management, and mortgage management, so I'm "OK" with systems analysis. What I would like to know is whether there is a standard method for field I/O and a standard method for file I/O.

I don't want to head down a road that doesn't make sense. My first project will be a card file with name, address, DoB, hobbies, and relationships to others in the file, etc.

I have written probably 30 small programs in C++, starting with Hello World.

My e-mail is parsoneabaohe@gmail.com
If you're just starting with C/C++ then I would suggest that you keep it simple: tab-delimited and comma-delimited files are still popular due to their simplicity. Both can be opened natively in LibreOffice/OpenOffice or MS Office, so you can check your results. Given your experience I would guess that combined they would occupy you for all of about 15 mins, but it will help you get the hang of file I/O on your platform. Next might be XML, which is a popular and slightly more complex standard to try your hand at. Anything after that, like SOAP (which uses XML) or MySQL, would require an hour or two segway for you to set up a server and/or VM.
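As a starting point for the tab-delimited suggestion, here is a minimal sketch of writing such a file with standard C++ streams. The function name `writeTsv` and the row layout are made up for illustration, and the sketch assumes that no field contains a tab or a newline.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: write rows as tab-separated text, one record per line.
// Assumes no field contains a tab or newline character.
bool writeTsv(const std::string& path,
              const std::vector<std::vector<std::string>>& rows) {
    std::ofstream out(path);
    if (!out) return false;  // could not open the file for writing
    for (const auto& row : rows) {
        for (std::size_t i = 0; i < row.size(); ++i) {
            if (i != 0) out << '\t';  // separator between fields only
            out << row[i];
        }
        out << '\n';
    }
    return static_cast<bool>(out);  // false if any write failed
}
```

Calling `writeTsv("cards.tsv", {{"Name", "DoB"}, {"Ann", "1950-01-02"}})` should produce a file that a spreadsheet program can open directly, which makes it easy to check the output by eye.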
I am also just learning C++, having previously only used BASIC and C.
Are you planning to write commercial apps?
Any reason to take on the difficult task of learning C++ instead of Visual Basic?
There are probably examples on the internet for the task you want to undertake.
Here is some stuff on files:
http://www.cplusplus.com/doc/tutorial/files/
As XML has come up I thought I should mention JSON and YAML, two alternative text-based storage formats. They are not as popular as XML but are more readable. All three have libraries to handle them.

JSON
http://json.org/

YAML
http://yaml.org/

So saying, I am a fan of XML even though it is more verbose. It's just that I try not to automatically use a given technology.

But I'm not sure SOAP is so relevant in this context. Isn't that an XML-based protocol for communicating with web servers? (That is, sending commands and receiving responses.)

And for a small app, SQLite might be worth considering if you want to use SQL without needing to set up a server.

SQLite
https://www.sqlite.org/
SQLite is a software library that implements a self-contained, serverless, zero-configuration, transactional SQL database engine. SQLite is the most widely deployed database engine in the world.


Andy

PS Segway? A ride on an electric scooter for use in pedestrian areas?
Yes, SOAP is a protocol that uses XML, I was trying to think of ways that he could illustrate that he is using the protocol correctly. It is probably a lot less useful of a suggestion than I originally thought.

Evidently I meant segue, as in a musical transition. See, this is what I get for trying to sound smart :p.
Sigh. This is why computers are about a million times faster today than in the 70's and yet they don't solve problems any faster.

If you want to do binary searches on the files then don't use tab-delimited or comma-delimited or XML or .... Use fixed-length records like you did before. The trick here is that you'll need to use character arrays instead of the more modern C++ strings in your class, because a std::string keeps its text in separately allocated memory and so cannot be written to disk as a fixed-size block.

Be sure to open your files in binary mode.
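The fixed-length approach might look roughly like the sketch below. The `Card` layout, field widths, and function names are all invented for illustration. It assumes the file is already sorted by name, and that it is read back on the same platform and compiler that wrote it, since the raw bytes of the struct are written as-is.

```cpp
#include <cstring>
#include <fstream>
#include <string>

// Hypothetical fixed-length record: every field is a char array,
// so each record occupies exactly sizeof(Card) bytes in the file.
struct Card {
    char name[32];
    char dob[11];  // "YYYY-MM-DD" plus the terminating '\0'
};

// Build a record, zero-filling unused bytes and truncating over-long input.
Card makeCard(const std::string& name, const std::string& dob) {
    Card c{};  // value-initialization zeroes every byte
    std::strncpy(c.name, name.c_str(), sizeof c.name - 1);
    std::strncpy(c.dob, dob.c_str(), sizeof c.dob - 1);
    return c;
}

// Append one record to the end of a file opened in binary mode.
bool appendCard(const std::string& path, const Card& c) {
    std::ofstream out(path, std::ios::binary | std::ios::app);
    out.write(reinterpret_cast<const char*>(&c), sizeof c);
    return static_cast<bool>(out);
}

// Binary search over records sorted by name.
// Returns the zero-based record index, or -1 if the name is not present.
long findCard(const std::string& path, const std::string& name, Card& found) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return -1;
    const std::streamoff recSize = sizeof(Card);
    in.seekg(0, std::ios::end);
    const std::streamoff bytes = in.tellg();
    std::streamoff lo = 0;
    std::streamoff hi = bytes / recSize - 1;
    while (lo <= hi) {
        const std::streamoff mid = lo + (hi - lo) / 2;
        in.seekg(mid * recSize);  // jump straight to record number mid
        Card c;
        in.read(reinterpret_cast<char*>(&c), sizeof c);
        if (!in) return -1;
        const int cmp = std::strcmp(c.name, name.c_str());
        if (cmp == 0) { found = c; return static_cast<long>(mid); }
        if (cmp < 0) lo = mid + 1; else hi = mid - 1;
    }
    return -1;
}
```

Because every record has the same size, record number n always starts at byte n * sizeof(Card), which is what makes seeking to the middle record possible. Opening in binary mode matters because text mode on some platforms translates line-ending bytes, which would shift those offsets.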

If this will be something complex or if the records are variable length then definitely consider SQLite.
So if I was going to write a card file, I'd probably save the data in txt format.

As a text file, you can import it into a database or Excel fairly easily if each record is on one line, which makes it more flexible for later use.
I'd use a special char that's not going to be used in your data, so
. / \ - ; : @ # $
are probably not good choices.
I'd personally use
~ or |
.

So I'd make my data file look like
First name | Last name | DOB | address | City | State | Zip | and on til the end.
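Reading such a line back could be sketched as follows, using `std::getline` with a custom delimiter. The helper names `trim` and `splitRecord` are hypothetical, and the sketch assumes the delimiter never appears inside a field.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Strip leading and trailing spaces and tabs from one field.
std::string trim(const std::string& s) {
    const auto first = s.find_first_not_of(" \t");
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(" \t");
    return s.substr(first, last - first + 1);
}

// Split one line of the data file on a single-character delimiter.
std::vector<std::string> splitRecord(const std::string& line, char delim = '|') {
    std::vector<std::string> fields;
    std::istringstream in(line);
    std::string field;
    while (std::getline(in, field, delim)) {
        fields.push_back(trim(field));
    }
    return fields;
}
```

One thing to be aware of: an empty field in the middle of a line is kept as an empty string, but an empty field after a final trailing delimiter is not reported by this loop, so a fixed field count is worth checking after the split.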
this is what I get for trying to sound smart

Ah ha -- that (segue) makes more sense. Segway (or segue) isn't a word I use so I had to look it up and got confused.

I'd personally use
~ or |

If I was going to use a delimiter I would prob use tab (one of Computergeek01's suggestions) as tab-delimited files are one of Excel and Calc's standard formats.

But the handling of hobbies and relationships might be better done with a hierarchical format.

Andy
When you feel the urge to design a complex binary file format, or a complex binary application protocol, it is generally wise to lie down until the feeling passes.
ESR in 'The Art of Unix Programming' http://catb.org/~esr/writings/taoup/html/ch05s01.html

Keep the file format simple, textual, extensible, transparent, and searchable using standard tools (e.g. grep). For instance:
card_number: 1
name: <name>
address: <address>
dob: <date of birth>
hobby: <hobby one>
hobby: <hobby two>
...
<a blank line at the end to signify end of information for this card>
card_number: 2
name: <name>
address: <address>
dob: <date of birth>
hobby: <hobby one>
hobby: <hobby two>
hobby: <hobby three>
...
<a blank line at the end to signify end of information for this card>
card_number: 3
... 


Read the information from the file into a sequence (std::vector or std::map) and operate on the data in memory. At the end, or when required, write the modified data back to the file.
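A rough sketch of that reading step is shown here, assuming the "key: value" layout illustrated earlier with a blank line between cards. The names `CardFields` and `readCards` are invented for this example. A `std::multimap` is used so that repeated keys such as `hobby` are all kept.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// One card: each key maps to one or more values (e.g. several "hobby" lines).
using CardFields = std::multimap<std::string, std::string>;

// Read cards in "key: value" form; a blank line ends the current card.
std::vector<CardFields> readCards(std::istream& in) {
    std::vector<CardFields> cards;
    CardFields current;
    std::string line;
    while (std::getline(in, line)) {
        // Tolerate Windows line endings in files read on other systems.
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (line.empty()) {
            if (!current.empty()) {
                cards.push_back(current);
                current.clear();
            }
            continue;
        }
        const auto colon = line.find(':');
        if (colon == std::string::npos) continue;  // skip malformed lines
        std::string value = line.substr(colon + 1);
        value.erase(0, value.find_first_not_of(' '));  // drop spaces after ':'
        current.emplace(line.substr(0, colon), value);
    }
    if (!current.empty()) cards.push_back(current);  // file may lack a final blank line
    return cards;
}
```

The function takes any `std::istream`, so it can be handed a `std::ifstream` for the real file or a `std::istringstream` when experimenting. Writing the data back is the reverse loop: one `key: value` line per entry, then a blank line after each card.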

To paraphrase ESR:
When you feel the urge to design a binary file format, and perform operations like binary searches using stream offsets in the file, it is generally wise to lie down until the feeling passes.
Thanks for everyone's responses. I am in the process of doing several things. After deciding to start with variable declaration, variable assignment, and variable I/O, I went on to arrays and accessing cells, and then wrote simple programs like temperature conversion from C to F and F to C.

When I was writing BASIC, I wrote a routine to get one character at a time from the keyboard and take its ASCII code to determine what was input. I defined the character set I wanted to allow: 0 to 9 (ASCII 48 to 57), caps A to Z (ASCII 65 to 90), lower case a to z (ASCII 97 to 122), comma (ASCII 44), and dash (ASCII 45). At the beginning of the program I defined 4 arrays: field length, vertical position, horizontal position, and pointer to next position at EOF. Each field had its vertical position, horizontal position, and length in different arrays, at the same cell position in each array. Given this, it did not take me long to write a multiple-page application. The input routine wasted space because each field was a string padded to the end with spaces, since fields were defined with a fixed length, which led to files with records of fixed length. Even numbers were input as strings; a VAL function turned the string into a number, calculations were done, and the result was converted back into a string.
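For comparison, the accept-or-reject and pad-to-length parts of that routine could be sketched in C++ as below. This is only an approximation under stated assumptions: standard C++ has no portable way to read the keyboard one keystroke at a time, so the sketch filters a whole string after it has been read. The names `isAllowed` and `toFixedField` are made up.

```cpp
#include <cstddef>
#include <string>

// True if ch is in the allowed set described above:
// digits, upper- and lower-case letters, comma, and dash.
bool isAllowed(char ch) {
    return (ch >= '0' && ch <= '9') ||
           (ch >= 'A' && ch <= 'Z') ||
           (ch >= 'a' && ch <= 'z') ||
           ch == ',' || ch == '-';
}

// Keep only allowed characters, stop at the field width,
// then pad with spaces so the result is always exactly width characters.
std::string toFixedField(const std::string& input, std::size_t width) {
    std::string field;
    for (char ch : input) {
        if (isAllowed(ch) && field.size() < width) field.push_back(ch);
    }
    field.resize(width, ' ');
    return field;
}
```

The role of BASIC's VAL is played in C++ by functions such as `std::stoi` and `std::stod`, and `std::to_string` converts a number back into a string.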

The coding was not so important as the systems analysis. For example, I wrote a DOM(s) program (Doctor's Office Management), which was nothing more than a patient profile (names [1st, middle, and last], address, insurance info, Medicare info, account balances [current, over 30, 60, and 90], etc.), an appointment scheduling program, a diagnostic list program (diagnostic code and description), a procedural list program (procedural code and description), accounts receivable for patients, and a forms program. I also wrote a pharmacy and job cost (with variances) program.

Your input is very important for the following reason. I decided to learn the following languages: C++, Java, and Python. When I write these small programs, I write each one in C++, Java, and Python one after another in order to compare the languages and learn quicker.

Here is the meat of my problem. I did not even know what an IDE was. I knew what GUI stood for because of DOS 3.3 and the evolution of Windows, but did not realize there were GUI(s) for different languages. For example, I did not realize Python came with Tkinter as a part of its standard library, Java had Swing, and C++ had Windows Developer (I think I got that right). QUESTION - WHEN CODE IS WRITTEN COMMERCIALLY or for high-end use, IS MOST OF IT HARD-CODED OR ARE PRESENT DAY PROGRAMMERS USING Tkinter, Swing, and Eclipse?

Thanks again - I will try not to bug you guys again. Hope this question is not vague and it is a good valid question.

It is one thing to learn a language and a completely different thing to learn a language and its associated GUI.

I think almost all code is written with the help of some sort of IDE these days. I write code on UNIX systems, and even though I primarily use a text editor and a compiler, the two are linked together so that, for example, the editor can parse the compiler output and take me right to the syntax errors. Pretty much any code that has a GUI interface is written with an IDE.