Network I/O
Whether it's loading a webpage, sending a message that goes unanswered, or streaming a cat video, the network is at work behind the scenes. In essence, a network enables data transfer from one place to another, no matter the distance. At a technical level, this is achieved through two sockets communicating with each other. In this post, we'll explore how it works and implement a basic client-server communication using TCP sockets.
File Descriptor
A file descriptor, or fd, is a fundamental concept for managing input/output (I/O) operations in Unix-like operating systems. It serves as an abstract handle through which a process interacts with files, devices, sockets, and other I/O resources. A file descriptor is a small, non-negative integer. When a file or resource is opened, the kernel assigns it a file descriptor that is unique within the process and stays valid until it is closed or the process exits. By convention, file descriptors 0, 1, and 2 are reserved for standard input, standard output, and standard error, respectively. The following program uses the open, read, write, and close system calls, all of which work with an fd, to print the contents of a file (for example, CMakeLists.txt) to standard output.
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

void read_file(const char *path) {
    int fd = open(path, O_RDONLY); // open in read-only mode
    if (fd == -1) { // open returns -1 on failure
        perror(path);
        return;
    }
    ssize_t nread; // read returns the byte count, 0 at EOF, or -1 on error
    char buf[1024];
    // read into buffer until EOF
    while ((nread = read(fd, buf, sizeof(buf))) > 0)
        write(STDOUT_FILENO, buf, (size_t) nread); // write buffer to stdout (fd 1)
    if (nread == -1)
        perror("read");
    close(fd); // release the fd
}

int main(int argc, char **argv) {
    if (argc < 2) {
        fprintf(stderr, "Usage: read_file FILE\n");
        return 1;
    }
    read_file(argv[1]);
    return 0;
}
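To make the "small, non-negative integer" point concrete, here is a minimal sketch of two helpers; the names open_readonly and fd_is_open are made up for this illustration. The kernel hands out the lowest descriptor number that is currently free, so in a process that still has 0, 1, and 2 open, the first call to open typically returns 3. The second helper uses fcntl with F_GETFD, which fails with EBADF once a descriptor has been closed.

```c
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: opens `path` read-only and returns the descriptor
// number chosen by the kernel, or -1 on failure. The caller must close() it.
int open_readonly(const char *path) {
    return open(path, O_RDONLY);
}

// Hypothetical helper: returns 1 if `fd` currently refers to an open
// resource in this process, 0 otherwise (fcntl fails with EBADF).
int fd_is_open(int fd) {
    return fcntl(fd, F_GETFD) != -1;
}
```

For example, calling open_readonly("/dev/null"), checking fd_is_open, then calling close and checking again shows the descriptor going from valid to invalid. After close, the same number can be reused by a later open.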
Sockets
Network communication, with rare exceptions, relies on sockets. A socket is identified by an IP address and port number, directing data to specific applications. Sockets use protocols like TCP for reliable transmission or UDP for faster, less reliable communication. With basic socket programming and C APIs, you can build almost any networked application. Higher-level protocols like HTTP can be built on top of low-level socket communication over TCP.
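As a rough illustration of "an IP address and port number", the sketch below fills in the struct sockaddr_in that the IPv4 socket calls expect. The helper name make_ipv4_address is invented for this example; inet_pton and htons are the standard conversion functions, and both the address and the port end up in network byte order.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

// Hypothetical helper: fills `addr` with an IPv4 address and port.
// Returns 0 on success, -1 if `ip` is not a valid dotted-quad string.
int make_ipv4_address(const char *ip, unsigned short port,
                      struct sockaddr_in *addr) {
    memset(addr, 0, sizeof(*addr));   // clear padding and unused fields
    addr->sin_family = AF_INET;       // IPv4
    addr->sin_port = htons(port);     // host to network byte order
    // inet_pton returns 1 on success and 0 for an unparsable string
    return inet_pton(AF_INET, ip, &addr->sin_addr) == 1 ? 0 : -1;
}
```

A call such as make_ipv4_address("127.0.0.1", 8001, &addr) produces an address that can be passed to connect or bind after casting it to struct sockaddr *.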
On the client side:
- socket: Creates a file descriptor (let's call it fd1), similar to how open works. At this point the socket exists, but we can't send or receive data yet.
- connect: Uses fd1, along with the IP address and port number that identify the server's socket. The connection is established once the TCP handshake with the listening server completes.
- send and recv: Similar to write and read, these functions send or receive up to a specified number of bytes. Note that a successful send doesn't guarantee the server has received the data; send only hands the bytes to the OS for delivery, and it may accept fewer bytes than requested. Likewise, recv returns whatever the OS has received so far, which may be fewer bytes than requested. A socket connection is bidirectional, meaning both the client and server can send and receive data.
- close: Given a connected fd1, close tells the kernel to close the connection. The kernel normally delivers any data still buffered and then signals end-of-stream (a TCP FIN), which the peer observes as recv returning 0. Both the client and server can initiate the closure of the connection.
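Because send only hands bytes to the OS and may accept fewer than requested, code that must deliver a whole buffer usually loops. The following is a minimal sketch of that pattern; the name send_all is not part of the sockets API, just a common convention, and retrying on EINTR is one reasonable policy among several.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper: keeps calling send() until all `len` bytes have been
// handed to the kernel. Returns 0 on success, -1 on error (errno is set).
int send_all(int fd, const void *data, size_t len) {
    const char *p = data;
    while (len > 0) {
        ssize_t n = send(fd, p, len, 0);
        if (n == -1) {
            if (errno == EINTR)
                continue;          // interrupted by a signal, try again
            return -1;             // real error
        }
        p += n;                    // skip the bytes the kernel accepted
        len -= (size_t) n;
    }
    return 0;
}
```

Even when send_all returns 0, the bytes have only been queued locally. Knowing that the peer processed them requires an application-level reply.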
On the server side:
- socket: Creates a file descriptor (let's call it fd2), just like on the client side. This fd2 represents the server-side socket.
- bind: Associates fd2 with a specific IP address and port number. This essentially reserves the IP and port: by default, no other socket can bind to the same address and port while fd2 holds them.
- listen: Marks the bound file descriptor (fd2) as "listening," signaling to the operating system that the server is ready to accept incoming connections. This step essentially declares the server "open for business."
- accept: Given a bound and listening file descriptor (fd2), accept waits for an incoming connection and creates a new socket for it, which results in a new file descriptor (e.g. fd3). The new fd3 is used exclusively for communication with that specific client via send and recv.
Note
The original fd2 on the server is solely responsible for accepting incoming connections and is not used for sending or receiving data. For every new client connection, a new fd (e.g., fd3, fd4, etc.) is created to facilitate communication. Handling multiple clients simultaneously requires managing these additional file descriptors.
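The split between the listening descriptor and the per-connection descriptor can be observed inside a single process. The sketch below is only an illustration under a few assumptions: the helper name loopback_trio is invented, it uses 127.0.0.1 with port 0 so the kernel picks a free port, and it relies on the kernel completing the TCP handshake for a listening socket before accept is called.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: creates a listener on 127.0.0.1, connects a client to
// it from the same process, and accepts that connection. On success returns 0
// and stores three distinct descriptors; on failure returns -1.
int loopback_trio(int *listener, int *client, int *accepted) {
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0); // port 0 asks the kernel to pick a free port

    *accepted = -1;
    *listener = socket(AF_INET, SOCK_STREAM, 0);
    *client = socket(AF_INET, SOCK_STREAM, 0);
    if (*listener == -1 || *client == -1) goto fail;
    if (bind(*listener, (struct sockaddr *) &addr, sizeof(addr)) == -1) goto fail;
    if (listen(*listener, 1) == -1) goto fail;
    // find out which port was assigned so the client knows where to connect
    if (getsockname(*listener, (struct sockaddr *) &addr, &len) == -1) goto fail;
    if (connect(*client, (struct sockaddr *) &addr, sizeof(addr)) == -1) goto fail;
    // accept returns a NEW descriptor; the listener keeps listening
    *accepted = accept(*listener, NULL, NULL);
    if (*accepted == -1) goto fail;
    return 0;

fail:
    if (*listener != -1) close(*listener);
    if (*client != -1) close(*client);
    return -1;
}
```

After a successful call, data written to the client descriptor arrives on the accepted descriptor, while the listener never carries payload and can go on to accept further connections.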
// TCP client: connects to the server and prints the message it receives
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <netinet/in.h>

int main(void) {
    // Create a TCP socket.
    // domain: AF_INET, specifies IPv4
    // type: SOCK_STREAM, specifies a TCP socket (as opposed to SOCK_DGRAM for UDP)
    // protocol: 0, uses the default protocol for the given type (TCP in this case)
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1) {
        // failed to create a socket
        perror("socket");
        return 1;
    }

    // connect to another socket
    // specify an address for the other socket
    struct sockaddr_in server_address;
    memset(&server_address, 0, sizeof(server_address));
    server_address.sin_family = AF_INET; // the same one we used for socket type above
    server_address.sin_port = htons(8001); // the server port in network byte order
    // the server ip in network byte order: the loopback address 127.0.0.1 here
    // (connecting to INADDR_ANY, i.e. 0.0.0.0, is not portable)
    server_address.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    // cast the address and connect
    // the status will be 0 on success
    int status = connect(fd, (struct sockaddr *) &server_address, sizeof(server_address));
    if (status == -1) {
        perror("connect");
        close(fd);
        return 1;
    }

    // receive some data from the server
    char buffer[512];
    // recv up to 511 bytes, leaving room for a terminating '\0'. 0 means no flags.
    ssize_t count = recv(fd, buffer, sizeof(buffer) - 1, 0);
    if (count == -1) {
        perror("recv");
        close(fd);
        return 1;
    }
    buffer[count] = '\0'; // recv does not null-terminate the data

    // print out the data that we get back from the server
    printf("received: %s\n", buffer);

    // close the socket
    close(fd);
    return 0;
}
// TCP server: accepts one connection, sends a message, and exits
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <netinet/in.h>

int main(void) {
    // the message to send
    char message[512] = "hello from tcp server";

    // Create a TCP socket.
    // domain: AF_INET, specifies IPv4
    // type: SOCK_STREAM, specifies a TCP socket (as opposed to SOCK_DGRAM for UDP)
    // protocol: 0, uses the default protocol for the given type (TCP in this case)
    int fd = socket(AF_INET, SOCK_STREAM, 0); // the server socket
    if (fd == -1) {
        // failed to create a socket
        perror("socket");
        return 1;
    }

    // define the server address
    struct sockaddr_in server_address;
    memset(&server_address, 0, sizeof(server_address));
    server_address.sin_family = AF_INET; // the same one we used for socket type above
    server_address.sin_port = htons(8001); // the server port in network byte order
    server_address.sin_addr.s_addr = htonl(INADDR_ANY); // 0.0.0.0: accept connections on any local ip

    // bind the socket to the address (ip + port)
    int status = bind(fd, (struct sockaddr *) &server_address, sizeof(server_address));
    if (status == -1) {
        perror("bind");
        close(fd);
        return 1;
    }

    // listen on the socket with a backlog of 10
    // the backlog is the maximum number of pending connections waiting to be accepted
    if (listen(fd, 10) == -1) {
        perror("listen");
        close(fd);
        return 1;
    }

    // the client's address and its length will be filled by accept
    // or you can pass NULL to ignore: accept(fd, NULL, NULL)
    struct sockaddr_in client_address;
    socklen_t client_len = sizeof(client_address);

    // accept a connection, which creates a new socket
    // a new fd for every accepted connection
    int client_socket = accept(fd, (struct sockaddr *) &client_address, &client_len);
    if (client_socket == -1) {
        perror("accept");
        close(fd);
        return 1;
    }

    // send the message, including its terminating '\0'. 0 means no flags.
    if (send(client_socket, message, strlen(message) + 1, 0) == -1)
        perror("send");

    // close the connection to the client, then the server socket
    close(client_socket);
    close(fd);
    return 0;
}
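A single recv call is not guaranteed to return a whole message, since TCP delivers a byte stream rather than discrete messages. When the sender closes the connection after writing, as the server above does, the receiver can simply read until recv returns 0. The sketch below shows that loop; the name recv_until_eof is invented for this example, and it stops early if the caller's buffer fills up.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper: reads from `fd` into `buf` until the peer closes the
// connection (recv returns 0) or `cap` bytes have been stored.
// Returns the total number of bytes read, or -1 on error.
ssize_t recv_until_eof(int fd, char *buf, size_t cap) {
    size_t total = 0;
    while (total < cap) {
        ssize_t n = recv(fd, buf + total, cap - total, 0);
        if (n == 0)
            break;                 // EOF: the peer closed its side
        if (n == -1) {
            if (errno == EINTR)
                continue;          // interrupted by a signal, try again
            return -1;             // real error
        }
        total += (size_t) n;
    }
    return (ssize_t) total;
}
```

Protocols that keep the connection open cannot rely on EOF and instead need explicit framing, such as a length prefix or a delimiter.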
Concurrent Connections
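The example above handles a single client and then exits. In real-world scenarios, servers often need to handle multiple simultaneous connections. One solution is to create a separate thread for each connection. In the following example, the client sends whatever it reads from standard input, and the server echoes it back from a dedicated thread per connection.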
// TCP client: sends each chunk read from stdin and prints the server's echo
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <netinet/in.h>

int main(void) {
    // Create a TCP socket.
    // domain: AF_INET, specifies IPv4
    // type: SOCK_STREAM, specifies a TCP socket (as opposed to SOCK_DGRAM for UDP)
    // protocol: 0, uses the default protocol for the given type (TCP in this case)
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1) {
        // failed to create a socket
        perror("socket");
        return 1;
    }

    // connect to another socket
    // specify an address for the other socket
    struct sockaddr_in server_address;
    memset(&server_address, 0, sizeof(server_address));
    server_address.sin_family = AF_INET; // the same one we used for socket type above
    server_address.sin_port = htons(8001); // the server port in network byte order
    // the server ip in network byte order: the loopback address 127.0.0.1 here
    // (connecting to INADDR_ANY, i.e. 0.0.0.0, is not portable)
    server_address.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    // cast the address and connect
    // the status will be 0 on success
    int status = connect(fd, (struct sockaddr *) &server_address, sizeof(server_address));
    if (status == -1) {
        perror("connect");
        close(fd);
        return 1;
    }

    char buffer[512];
    ssize_t nread;
    // read from stdin and send it
    while ((nread = read(STDIN_FILENO, buffer, sizeof(buffer))) > 0) {
        // buffer is not null-terminated, so print exactly nread bytes
        printf("sending: %.*s\n", (int) nread, buffer);
        ssize_t count = send(fd, buffer, (size_t) nread, 0); // send only the bytes we read
        if (count == -1) {
            perror("send");
            break;
        }
        printf("sent %zd bytes\n", count);
        count = recv(fd, buffer, sizeof(buffer), 0); // receive the echo
        if (count <= 0) // 0: the server closed the connection, -1: error
            break;
        printf("received %zd bytes: %.*s\n", count, (int) count, buffer);
    }

    // close the socket
    printf("closing connection\n");
    close(fd);
    return 0;
}
// TCP echo server: handles each client connection in its own thread
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <pthread.h>

void *handle_connection(void *arg) {
    int client_socket = *(int *) arg;
    free(arg); // main heap-allocated the argument for this thread alone

    // recv and echo back until EOF (close)
    while (1) {
        char buffer[512];
        ssize_t count = recv(client_socket, buffer, sizeof(buffer), 0);
        if (count <= 0) {
            // 0 means EOF (the client closed the connection), -1 means error
            if (count == -1)
                perror("recv");
            printf("closing connection: %d\n", client_socket);
            close(client_socket);
            break;
        }
        // buffer is not null-terminated, so print exactly count bytes
        printf("[%d] received: %.*s\n", client_socket, (int) count, buffer);
        // send back exactly the bytes we received
        count = send(client_socket, buffer, (size_t) count, 0);
        printf("[%d] sent: %zd\n", client_socket, count);
    }
    return NULL;
}

int main(void) {
    // Create a TCP socket.
    // domain: AF_INET, specifies IPv4
    // type: SOCK_STREAM, specifies a TCP socket (as opposed to SOCK_DGRAM for UDP)
    // protocol: 0, uses the default protocol for the given type (TCP in this case)
    int fd = socket(AF_INET, SOCK_STREAM, 0); // the server socket
    if (fd == -1) {
        // failed to create a socket
        perror("socket");
        return 1;
    }

    // define the server address
    struct sockaddr_in server_address;
    memset(&server_address, 0, sizeof(server_address));
    server_address.sin_family = AF_INET; // the same one we used for socket type above
    server_address.sin_port = htons(8001); // the server port in network byte order
    server_address.sin_addr.s_addr = htonl(INADDR_ANY); // 0.0.0.0: accept connections on any local ip

    // bind the socket to the address (ip + port)
    int status = bind(fd, (struct sockaddr *) &server_address, sizeof(server_address));
    if (status == -1) {
        perror("bind");
        close(fd);
        return 1;
    }

    // listen on the socket with a backlog of 10
    // the backlog is the maximum number of pending connections waiting to be accepted
    if (listen(fd, 10) == -1) {
        perror("listen");
        close(fd);
        return 1;
    }

    while (1) {
        // accept a connection, which creates a new socket
        // a new fd for every accepted connection; the client address is ignored here
        int client_socket = accept(fd, NULL, NULL);
        if (client_socket == -1) {
            perror("accept");
            continue;
        }
        printf("connection accepted with fd %d\n", client_socket);

        // give the thread its own copy of the fd: passing &client_socket would race
        // with the next loop iteration overwriting the variable
        int *arg = malloc(sizeof(*arg));
        if (arg == NULL) {
            close(client_socket);
            continue;
        }
        *arg = client_socket;

        // handle the connection in a dedicated thread with default attributes
        pthread_t thread;
        if (pthread_create(&thread, NULL, handle_connection, arg) != 0) {
            fprintf(stderr, "failed to create thread\n");
            free(arg);
            close(client_socket);
            continue;
        }
        pthread_detach(thread); // nobody joins the thread, so let it clean up after itself
    }

    // not reached in this example: a real server would close the socket on shutdown
    close(fd);
    return 0;
}
While this approach is a good starting point for managing concurrency, it is not efficient for large-scale applications. In future posts, we will explore alternatives like thread pools or event-driven models for more efficient handling of concurrent connections.