C Read One Character at a Time From Socket
This item was added on: 2005/02/12
- Introduction
- Sample of incorrect code
- Mistake 1: Oh no, overflow...!
- Possible mistake 2: Where did that extra byte come from?
- Mistake 3: Did I read you right?
- Stream Rules
- Some diagrams!
- Solution 1 - Use a delimiter
- Solution 2 - Use a data length indicator
- Solution 3 - Fixed sized messages
- Break up your code!
Introduction
When people create their first TCP/IP socket based program, they often don't realise what a mine field they're stepping into. There's a big leap involved in getting from a basic "hello world" style socket program to something that is actually of any real use. In the haste to build something bigger, better and more fun, it is easy to overlook the basic steps needed to allow successful exchange of data through a socket. Here we discuss some of these issues in detail, in the hope that the reader can avoid making wrong assumptions from the start.
This particular article covers the principles of controlling the receiving buffer. You'll see a classic buffer overflow.
Sample of incorrect code
The following is an excerpt from a simple socket program that receives data. It shows a few common mistakes:
/* Sample1.c */
01 char Response[] = "COMMAND OK";
02 char CommandBuffer[BUFSIZ];
03
04 nBytes = recv(socket, CommandBuffer, sizeof(CommandBuffer), 0);
05
06 if (nBytes == -1)
07 {
08   /*
09    * Socket in error state
10    */
11   perror ("recv");
12   return 0;
13 }
14
15 if (nBytes == 0)
16 {
17   /*
18    * Socket has been closed
19    */
20   fprintf (stderr, "Socket %d closed", socket);
21   close (socket);
22   return 0;
23 }
24
25 /*
26  * Command read OK, let's process it!
27  */
28
29 CommandBuffer[nBytes] = '\0';
30
31 if (strcmp (CommandBuffer, "QUIT") == 0)
32 {
33   printf ("Remote program said QUIT!\n");
34   send(socket, Response, sizeof(Response), 0);
35 }
36
37 /*
38  * and so on....
39  */
Using this code, I'll highlight three bugs. One minor, and two major (but not in that order!).
Mistake 1: Oh no, overflow...!
On line 04 of Sample1.c, recv() is asked to fill CommandBuffer with up to sizeof(CommandBuffer) bytes. Let's presume that it does so successfully, and the resulting count is stored in nBytes. Having got past the subsequent conditional statements, line 29 applies a \0 character to the buffer to null terminate it, ready for use as a string. This is a serious error: if recv() filled the whole buffer, we've just written to memory we weren't supposed to; it's a classic buffer overflow. Effectively, we've just done this:
char CommandBuffer[512];
CommandBuffer[512] = '\0'; /* Oops, this is one byte past the end of the array */
If you're expecting to recv() a string, I'd suggest giving the recv() function arraysize-1 bytes to write to, something like:
nBytes = recv(socket, CommandBuffer, sizeof(CommandBuffer) - 1, 0);
That way, nBytes can be used to safely apply the \0 character, as in the original code.
Possible mistake 2: Where did that extra byte come from?
We can see that the code above expects the data it receives to be suitable for use as a string, with the exception that it will null terminate the array itself. This means it does not expect to receive a \0 character in the data from the socket. It would therefore be reasonable to assume that the application at the other end expects the same. On line 34 of Sample1.c, the send() function returns a string denoting that a command was accepted, but this code sends the null terminator as well. This is because sizeof(Response) is used, which will yield the length of COMMAND OK\0. A better choice would have been to use strlen() to determine the length.
This may not be a problem; the other end may be able to cope with or without a \0 but, at the very least, our design should be consistent, and we should be fully aware of what we're sending. Hence I labelled this section a "Possible mistake".
In fact, if you're only moving strings around, the extra \0 is unlikely to cause problems, but if you start moving more complex data structures, then you definitely need to be more careful.
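The one byte difference between sizeof() and strlen() can be checked directly. The following is only a small sketch: wire_length is a hypothetical helper name, not part of Sample1.c, and it assumes the design decision is that the terminator should not go over the wire.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same declaration as line 01 of Sample1.c. */
static const char Response[] = "COMMAND OK";

/*
 * Hypothetical helper: the number of bytes to hand to send() when the
 * '\0' terminator should NOT be transmitted.
 */
size_t wire_length(const char *text)
{
    assert(text != NULL);
    return strlen(text);    /* counts the characters, not the '\0' */
}

/*
 * sizeof(Response) is 11: ten characters plus the terminator.
 * wire_length(Response) is 10: the characters only.
 */
```

With this in place, line 34 could become send(socket, Response, wire_length(Response), 0), provided the other end really does not expect the terminator.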
Mistake 3: Did I read you right?
This last mistake is a little more complicated than the first two, and will take a lot to fix, but it is something that must be done. To summarise the problem: you can never be sure about how much data the application will receive when it calls recv(). Just because you think the other end might send you one of a preset number of one-word commands, e.g. QUIT, doesn't mean that's what you'll recv(). To fully appreciate the dilemma, you need to first understand that TCP/IP is a "stream" based protocol. Let's discuss that bit first...
Stream Rules
- Data is delivered as a series of bytes that will arrive at the target application in the order they were sent.
- Data arrives as and when everything in between the two applications feels like delivering it.
- Data can be split into multiple packets, dependent on lots of things that are mostly out of your control.
- send()ing multiple messages does not guarantee that you'll recv() the same number of messages.
- The receiving application must cater for split messages.
- The receiving application must cater for joined messages.
- There is no automatic magic marker at the start of a message, nor at the end of a message.
The key thing to note is that messages can be split or joined, and yes, you will probably have to do something about re-assembling them on the receiving end. This is where Sample1.c has failed; it makes no attempt to ensure that it has received a single, complete command before processing it.
Some diagrams!
Let's look at some samples of what could happen when you call recv() to get the command in Sample1.c:
Scenario 1: One command, one recv(). In this case, one call to recv() gets one command. Nice and simple!
+-------------+
|  recv() 1   |
+-------------+
|USER MYNAME\0|
+-------------+

Scenario 2: One command, two recv()s. We need two calls to recv(), and we must re-assemble the data.
+--------+--------+
|recv() 1|recv() 2|
+--------+--------+
|USER MYN|AME\0   |
+--------+--------+

Scenario 3: Two commands, one recv(). We need one call to recv(), and we must split the data in order to process both commands.
+------------------------------+
|           recv() 1           |
+------------------------------+
|USER MYNAME\0PASSWORD MYPASS\0|
+------------------------------+

Scenario 4: Two commands, two recv()s. We need two calls to recv(), and we must split and re-assemble the data.
+------------------+------------+
|     recv() 1     |  recv() 2  |
+------------------+------------+
|USER MYNAME\0PASSW|ORD MYPASS\0|
+------------------+------------+

There are other scenarios, including "no data available" and "error conditions", which we will not cover here.
As you can see, the streaming protocol is quite ruthless with your data. It will chop and join wherever it feels like; it's up to you to fix it! Now we move to the fun part, how to actually do that fix...
There are three solutions on offer here, none of which come with full code; that is left as an exercise for the reader.
Solution 1 - Use a delimiter
When processing only strings, as in Sample1.c, you can use a delimiter byte to break up messages. A good choice would be the \0 character that terminates all C strings. The recv()ing program can behave in one of two ways:
1) Read a single byte at a time...
... until it hits the \0 character, at which point it can assume it has received a complete command. This option is nice and simple, but does come with an overhead of repeatedly calling recv(), which is inefficient.
2) Read an arbitrary number of bytes...
... and then parse the receiving buffer, looking for a \0 character. Once found, pass the details back to the calling function. But we mustn't forget about the bytes in the buffer after the first string; they'll need processing at some point, too.
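Reading a byte at a time might be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: recv_until_nul is a hypothetical name, fd is assumed to be a connected stream socket, and an interrupted call (EINTR) is simply treated as an error.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: read one byte at a time from fd until the '\0'
 * delimiter arrives. On success buf holds the '\0'-terminated command
 * and the return value is its length (excluding the '\0').
 * Returns -1 on error, if the peer closes mid-command, or if no
 * delimiter shows up within cap bytes.
 */
ssize_t recv_until_nul(int fd, char *buf, size_t cap)
{
    size_t used = 0;

    assert(buf != NULL && cap > 0);

    while (used < cap) {
        char c;
        ssize_t rc = recv(fd, &c, 1, 0);

        if (rc <= 0)
            return -1;                  /* error, or socket closed */

        buf[used++] = c;
        if (c == '\0')
            return (ssize_t)(used - 1); /* complete command received */
    }

    return -1;                          /* command too long for buf */
}
```

Because each call consumes exactly one command and nothing more, any bytes belonging to the next command stay queued in the socket, so joined messages need no extra handling. The price is one recv() call per byte.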
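The buffer handling for this second option could look something like the sketch below. The names rx_buffer, rx_append and rx_next_message are hypothetical, and the 512 byte capacity is an arbitrary assumption. To keep the example self-contained it works on bytes already received; in a real program the bytes passed to rx_append would come from recv().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical receive buffer: bytes collected from one or more recv()s. */
struct rx_buffer {
    char   data[512];
    size_t used;
};

/* Append freshly received bytes. Returns 0, or -1 if they do not fit. */
int rx_append(struct rx_buffer *rx, const char *bytes, size_t n)
{
    assert(rx != NULL && bytes != NULL);

    if (n > sizeof(rx->data) - rx->used)
        return -1;

    memcpy(rx->data + rx->used, bytes, n);
    rx->used += n;
    return 0;
}

/*
 * Copy the first complete '\0'-terminated message into out and remove
 * it from the buffer, keeping whatever follows it for the next call.
 * Returns 1 if a message was extracted, 0 if no complete message has
 * arrived yet, -1 if the message is too big for out.
 */
int rx_next_message(struct rx_buffer *rx, char *out, size_t out_cap)
{
    char  *end;
    size_t len;

    assert(rx != NULL && out != NULL);

    end = memchr(rx->data, '\0', rx->used);
    if (end == NULL)
        return 0;                        /* split message: need more data */

    len = (size_t)(end - rx->data) + 1;  /* include the delimiter */
    if (len > out_cap)
        return -1;

    memcpy(out, rx->data, len);
    memmove(rx->data, rx->data + len, rx->used - len);
    rx->used -= len;
    return 1;
}
```

The caller would typically loop on rx_next_message() until it returns 0 before calling recv() again, which covers both the split and the joined scenarios.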
Solution 2 - Use a data length indicator
Every message that is sent can be prefixed with a value that represents the data's length. The receiver starts by recv()ing a fixed number of bytes to get this length indicator and, once it has it, it recv()s that specific number of bytes. If all goes well, two recv()s are all that are needed to read one message. Of course, the data may be split, meaning that you need two or more calls to get it... and don't forget this includes the getting of the length indicator in the first place!
Sample data stream, using a 4 byte length indicator:
0017This is a message0010So is this

Pseudo Code:
Prototype: int myrecv(void *buf, size_t max_buffer_size, size_t bytes_to_read);

INDICATOR_LEN = 4
rc = myrecv(buf, sizeof(buf), INDICATOR_LEN);
if (rc != INDICATOR_LEN) Then exit();
UserDataLength = ConvertNumberFromString(buf, INDICATOR_LEN);
if (UserDataLength Is Out_Of_Bounds) Then exit();
UserData = malloc(UserDataLength);
rc = myrecv(UserData, UserDataLength, UserDataLength);
if (rc == UserDataLength) Then We Received A Complete Message!
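As a rough C rendering of the length indicator idea, the sketch below reads the 4 digit indicator and then the payload, looping until each part is complete. It is one possible shape under stated assumptions, not the article's own code: recv_n and recv_message are hypothetical names, the indicator is assumed to be 4 ASCII decimal digits as in the sample stream, and the caller supplies the output buffer instead of using malloc().

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

#define INDICATOR_LEN 4

/*
 * Keep calling recv() until exactly n bytes have arrived.
 * Returns 0 on success, -1 on error or if the peer closes early.
 */
int recv_n(int fd, char *buf, size_t n)
{
    size_t got = 0;

    while (got < n) {
        ssize_t rc = recv(fd, buf + got, n - got, 0);
        if (rc <= 0)
            return -1;
        got += (size_t)rc;
    }
    return 0;
}

/*
 * Read one message: INDICATOR_LEN ASCII digits giving the length,
 * followed by that many bytes of data. A '\0' is appended to out.
 * Returns the data length, or -1 on any failure.
 */
long recv_message(int fd, char *out, size_t out_cap)
{
    char   header[INDICATOR_LEN];
    size_t len = 0;
    int    i;

    assert(out != NULL && out_cap > 0);

    if (recv_n(fd, header, INDICATOR_LEN) != 0)
        return -1;

    for (i = 0; i < INDICATOR_LEN; i++) {
        if (header[i] < '0' || header[i] > '9')
            return -1;                  /* malformed indicator */
        len = len * 10 + (size_t)(header[i] - '0');
    }

    if (len >= out_cap)
        return -1;                      /* out of bounds for this buffer */

    if (recv_n(fd, out, len) != 0)
        return -1;

    out[len] = '\0';
    return (long)len;
}
```

Note that recv_n() is used for the indicator as well as the data, since either can be split across several recv() calls.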
Solution 3 - Fixed sized messages
This option is not really practical for strings, but if you're sending data structures around, they might well be of fixed size. In this case, recv()ing sizeof(dataStructure) bytes would be a good choice, as long as you also handle split packets.
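One way to read a fixed sized structure is sketched below. It is an illustration only: command_record and recv_record are hypothetical names, and the structure deliberately contains only char arrays so that padding and byte order do not come into it. It uses the standard MSG_WAITALL flag, which asks recv() to wait for the full amount, but the count is still checked because the call can return early on a signal, an error or a disconnect. A loop that keeps calling recv() until the count is reached would be the more defensive alternative.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical fixed sized record; both ends must agree on its layout. */
struct command_record {
    char name[8];
    char argument[24];
};

/*
 * Read exactly one record from a connected stream socket.
 * Returns 0 when a whole record was read, -1 otherwise.
 */
int recv_record(int fd, struct command_record *rec)
{
    ssize_t rc;

    assert(rec != NULL);

    /* Ask for the whole record in one call, then verify we got it. */
    rc = recv(fd, rec, sizeof(*rec), MSG_WAITALL);

    return (rc == (ssize_t)sizeof(*rec)) ? 0 : -1;
}
```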
Use functions
Another common problem is making functions too long and complex. Remember, no matter how you choose to read data from a socket, break your program up into lots of small functions that perform specific tasks. It will make management of the code, and debugging, a lot easier.
Written by: Hammer
Source: https://faq.cprogramming.com/cgi-bin/smartfaq.cgi?id=1044780608&answer=1108255660