Introduction to OSI model and TCP/IP for Testers
Most applications out there run on the HTTP protocol, so having a solid understanding of this protocol will make your testing work much more manageable. We explored this in a previous post: What is HTTP protocol – introduction to HTTP for Testers. But there’s more to networks than just HTTP. In this post, we are going to dive deeper into networks by exploring the OSI model.
My main goal in this article is to show you the OSI model and explain how data flows in a network. Then I will go through the differences between the OSI model and TCP/IP. At the end of the article, I will also mention a few protocols used in networks.
But before we get into the details, I should explain some basic terminology.
LAN (Local Area Network) and WLAN (Wireless Local Area Network)
LAN is a local network that consists of a group of computers and devices connected via a single physical network (cables). It is limited to a specific geographic area/location.
An excellent example of this kind of network would be a library, office, or home network. That said, most of us don’t rely on a purely wired LAN at home these days, because a LAN connects devices via cables. Nowadays, our devices are usually connected wirelessly via Wi-Fi, so we’re talking about a WLAN.
WAN (Wide Area Network)
WAN combines numerous sites and covers large geographic regions (connecting physically distant locations). The best example of this is the internet itself – that is, thousands of local networks (LAN / WLAN) connected.
Another example would be connecting three company offices in different cities. Each office has its LAN. By combining them, we could create the company’s own internal network – WAN.
Differences between IP and MAC address
You have probably already heard of and know something about what an IP is. However, you may not have met the concept of a MAC address. So, let me explain in a few words what an IP is, and then a MAC address, to illustrate the key differences between them.
IP (internet protocol)
We use IP for communication between different networks (to address and transport data from one network to another). It handles routing, i.e., routers use the destination IP address to choose a suitable route for each data packet. An IP address is a logical address – this means that it is allocated depending on which network the device has been connected to. If a device is in two networks, it will have two IP addresses.
MAC address (Media Access Control)
A MAC address is a physical address: a unique identifier burned into the network card. It identifies a specific device and is assigned by the manufacturer. MAC addresses are used for communication within one network, e.g., in a home network, if you want to connect a computer to a printer or other devices, it will use MAC addresses to do that.
Key differences to remember
IP address:
- Identifies connection with a device in the network
- Assigned by the network administrator or ISP (internet service provider)
- Used in WAN communication
MAC address:
- Identifies device in the network
- Assigned by the manufacturer
- Used in LAN/WLAN communication
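To make the difference between the two kinds of addressing more concrete, here is a minimal Python sketch using the standard ipaddress module. The addresses and the /24 network size are made-up examples, not values from any real setup. It only checks whether a destination sits in the same local network (where delivery by MAC address is enough) or in a different one (where the packet has to be routed using IP).

```python
import ipaddress

def same_network(source_ip: str, destination_ip: str, prefix_length: int) -> bool:
    """Return True if both addresses belong to the same IP network."""
    # strict=False lets us pass a host address and get its network back.
    network = ipaddress.ip_network(f"{source_ip}/{prefix_length}", strict=False)
    return ipaddress.ip_address(destination_ip) in network

# Hypothetical home network: the laptop and printer share 192.168.1.0/24.
laptop = "192.168.1.10"
printer = "192.168.1.25"
remote_server = "203.0.113.7"  # address from a documentation-only range

print(same_network(laptop, printer, 24))        # True: local delivery via MAC addresses
print(same_network(laptop, remote_server, 24))  # False: the packet must be routed via IP
```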
The OSI model has never been directly implemented, as it’s mostly a reference architecture describing how data should flow from one application to another through a network. In practice, TCP/IP is what is actually used, and these days it’s the most popular model. I will say more about TCP/IP after the OSI model, but it’s good to start with OSI because it makes some of the concepts easier to understand.
The OSI model consists of 7 layers divided into two groups:
- Host layers (happening on the computer side. Responsible for accurate data delivery between devices)
- Media layers (happening on the network side. Responsible for making sure that the data has arrived at its destination)
7. Application layer
In this layer, the user interacts directly with applications. This is where it is decided which interfaces are used to interact with the network, through the corresponding protocols in this layer.
Examples of such applications are Chrome or Gmail:
- Chrome uses the HTTP / HTTPS protocol
- Gmail uses email protocols like SMTP, IMAP.
The applications themselves are not in the application layer – in this layer, there are only the protocols or services that the applications use.
6. Presentation layer
The task of this layer is proper data representation, compression/decompression, encryption/decryption. This ensures that the data sent from the X system application layer can be read by the Y system application layer.
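As a rough illustration of what "representation" and "compression" mean here, the following Python sketch serializes some data, encodes it as bytes, and compresses it, then reverses the steps on the receiving side. The function names and the sample message are invented for this example, and encryption is left out to keep it short. Real systems do this work through protocols and libraries rather than hand-written helpers like these.

```python
import json
import zlib

def encode_for_transfer(data: dict) -> bytes:
    """Sender side: serialize to JSON, encode as UTF-8 bytes, then compress."""
    text = json.dumps(data)
    return zlib.compress(text.encode("utf-8"))

def decode_after_transfer(payload: bytes) -> dict:
    """Receiver side: decompress, decode UTF-8, then parse the JSON."""
    text = zlib.decompress(payload).decode("utf-8")
    return json.loads(text)

# Hypothetical message exchanged between system X and system Y.
message = {"user": "tester", "note": "hello from system X"}
payload = encode_for_transfer(message)

print(type(payload).__name__)                     # bytes
print(decode_after_transfer(payload) == message)  # True
```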
5. Session layer
This layer is responsible for creating, managing, and then closing sessions between two applications that want to communicate with each other.
4. Transport layer
The task of this layer is to make sure that the data has arrived safely from the sender to the recipient. When it sends data, it breaks it into segments. When it accepts data, it puts it back into a stream of data.
In this layer, two protocols are used: TCP and UDP (I’ll say more about these later in the article).
3. Network layer
Provides addressing and routing services. It defines which routes connect individual computers and decides how much information to send using one connection or another. Data transferred through this layer are called packets.
Places two addresses in the packet sent:
- Source address
- Destination address
This layer is based on IP (internet protocol).
2. Data-link layer
This layer deals with packing data into frames and sending them to the physical layer. It also oversees the quality of the information provided by the physical layer. It recognizes errors such as lost or damaged frames and deals with their repair.
1. Physical layer
This is the physical aspect of the network. This applies to cables, network cards, Wi-Fi, etc. It is only used to send logical zeros and ones (bits). It determines how fast the data flows. When this layer receives frames from the data-link layer, it changes them to a bitstream.
Encapsulation and decapsulation of data
Encapsulation adds pieces of information to data sent over the network. This occurs when we send data. At each layer, some information is added to our data: the addresses of the sender and recipient, the encryption method, the data format, how the data will be divided and sent, etc.
Decapsulation occurs when we receive information. It consists of removing the pieces of information that were added for the journey through the network. At each layer, some of that information is stripped off. In the end, the user only gets what they should receive, without the IP address, MAC address, etc.
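The wrapping and unwrapping described above can be sketched in a few lines of Python. This is a toy model only: the Wrapped class, the header fields, and every address and port below are invented for illustration, and only three layers are shown.

```python
from dataclasses import dataclass

@dataclass
class Wrapped:
    layer: str
    header: dict
    payload: object  # either the unit from the layer above or the raw data

def encapsulate(data: str) -> Wrapped:
    """Sender side: each layer wraps the unit from the layer above with its own header."""
    segment = Wrapped("transport", {"src_port": 50000, "dst_port": 80}, data)
    packet = Wrapped("network", {"src_ip": "192.168.1.10", "dst_ip": "203.0.113.7"}, segment)
    frame = Wrapped(
        "data link",
        {"src_mac": "aa:aa:aa:aa:aa:aa", "dst_mac": "bb:bb:bb:bb:bb:bb"},
        packet,
    )
    return frame

def decapsulate(unit):
    """Receiver side: strip headers layer by layer until only the data is left."""
    while isinstance(unit, Wrapped):
        unit = unit.payload
    return unit

frame = encapsulate("GET /index.html")
print(frame.layer)         # data link
print(decapsulate(frame))  # GET /index.html
```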
Differences between the OSI model and TCP/IP
The TCP/IP model has a similar organization of layers to the OSI model. However, TCP/IP is not as rigorously divided and better reflects the actual structure of the Internet.
In TCP/IP, there are only four layers:
- Application layer
- Transport layer
- Internet layer
- Network interface layer
The OSI model makes a clear distinction between layers and some concepts. In TCP/IP, it is harder to make this clear distinction and explain these concepts. Now you can see why I introduced the OSI model to you before TCP/IP.
The TCP/IP application layer contains three layers from the OSI model:
- Application layer
- Presentation layer
- Session layer
The application layer in TCP/IP does the combined work of these three layers from the OSI model. In this layer, we have various protocols such as HTTP, DNS, SMTP, FTP.
The transport and internet layers in TCP/IP work as I described for the transport and network layers in the OSI model. In the next section, I will reveal more details on how the transport layer protocols (TCP and UDP) work.
The network interface layer in TCP/IP is a combination of two layers from the OSI model (data link and physical layer). I’m not going to go into the details of this layer, as I have already described the critical functions of these last two layers in the OSI model. Here in TCP/IP, these functions are realized in one layer.
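The mapping between the two models can be written down as a small lookup table. The Python sketch below uses the layer names from this article, and the helper function name is made up for the example.

```python
# OSI layer -> TCP/IP layer, listed from top (7) to bottom (1).
OSI_TO_TCPIP = {
    "Application": "Application",
    "Presentation": "Application",
    "Session": "Application",
    "Transport": "Transport",
    "Network": "Internet",
    "Data link": "Network interface",
    "Physical": "Network interface",
}

def osi_layers_in(tcpip_layer: str) -> list:
    """Return the OSI layers that a given TCP/IP layer covers."""
    return [osi for osi, tcpip in OSI_TO_TCPIP.items() if tcpip == tcpip_layer]

print(osi_layers_in("Application"))        # ['Application', 'Presentation', 'Session']
print(osi_layers_in("Network interface"))  # ['Data link', 'Physical']
```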
Protocols in the TCP/IP model
Internet layer protocols
ARP (Address Resolution Protocol)
Used to identify the MAC address. If the device knows the IP address of the target device, then ARP sends a request to all of the devices in the LAN to search for the MAC address of the device with the given IP. Then the device with that IP sends an ARP response with its MAC address.
This information will be saved in the ARP table. In Windows or macOS, open a terminal and type arp -a. Then you should see the ARP table.
In the image below, you can see how this process works when an ARP request matches the IP of the device.
The RARP protocol performs the reverse operation.
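The ARP request and response exchange can be mimicked with a toy simulation in Python. Nothing here touches a real network: the device list, the addresses, and the function name are all hypothetical, and the "broadcast" is just a loop over a dictionary.

```python
# Hypothetical devices on one LAN: the IP address each device holds -> its MAC address.
LAN_DEVICES = {
    "192.168.1.1": "aa:bb:cc:00:00:01",   # router
    "192.168.1.10": "aa:bb:cc:00:00:10",  # laptop
    "192.168.1.25": "aa:bb:cc:00:00:25",  # printer
}

arp_table = {}  # the requester's cache of resolved IP -> MAC pairs

def arp_resolve(target_ip: str):
    """Return the MAC for target_ip, asking the whole LAN only on a cache miss."""
    if target_ip in arp_table:
        return arp_table[target_ip]
    # "Broadcast": every device sees the request, only the owner of the IP replies.
    for device_ip, device_mac in LAN_DEVICES.items():
        if device_ip == target_ip:
            arp_table[target_ip] = device_mac  # save the reply in the ARP table
            return device_mac
    return None  # nobody on the LAN has this IP

print(arp_resolve("192.168.1.25"))  # aa:bb:cc:00:00:25
print(arp_table)                    # {'192.168.1.25': 'aa:bb:cc:00:00:25'}
```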
IP (Internet protocol)
I explained at the beginning of this article what IP is. But I want to make clear that the IP in the TCP/IP model is in the internet layer. It is also good to add that IP has two versions: IPv4 and IPv6.
The second one was introduced because the pool of available IPv4 addresses is running out. IPv6 offers a far larger address space and is generally described as more efficient, with better routing and improved security.
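To get a feel for why IPv4 addresses run out, the sketch below uses Python's standard ipaddress module to count how many addresses each version can express. The sample IPv6 address comes from a range reserved for documentation.

```python
import ipaddress

# The whole address space of each version, written as a single network.
ipv4_total = ipaddress.ip_network("0.0.0.0/0").num_addresses
ipv6_total = ipaddress.ip_network("::/0").num_addresses

print(ipv4_total)              # 4294967296, i.e. 2**32
print(ipv6_total == 2 ** 128)  # True

address = ipaddress.ip_address("2001:db8::1")  # documentation-only IPv6 address
print(address.version)         # 6
```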
ICMP (Internet Control Message Protocol)
This acts as a tool for solving problems. The ICMP reports any communication errors between hosts. ICMP messages can help to diagnose a problem. For example, if the router or host is overloaded, ICMP can send a message to slow down the transfer rate.
ICMP is used in the ping program, which allows the diagnosis of network connections. Ping lets you check if there is a connection between the hosts. It also allows you to measure the number of packets lost and delays in their transmission.
In the terminal, type ping www.scalac.io. After ping, you need to provide the host. You can choose any website. I’m going to check my connection with the scalac site. To exit ping, use CTRL + C.
Ping sends ICMP packets to the host provided. In my case, I sent 17 packets and received back 17 packets. In this short connection, I didn’t lose any packets. The program also measures the time gap between sending each packet and receiving its reply. In the end, the program summarizes the connection and shows us the minimum / average / maximum time gap between sending and receiving packets.
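The summary that ping prints at the end is simple arithmetic, which the Python sketch below reproduces. The function name and the round-trip times are made up for illustration and are not taken from a real ping run. The sketch assumes at least one reply was received.

```python
def summarize_ping(sent: int, round_trip_times_ms: list) -> dict:
    """Compute packet loss and min/avg/max round-trip time, like ping's summary."""
    received = len(round_trip_times_ms)  # one measured time per reply
    return {
        "sent": sent,
        "received": received,
        "loss_percent": 100.0 * (sent - received) / sent,
        "min_ms": min(round_trip_times_ms),
        "avg_ms": sum(round_trip_times_ms) / received,
        "max_ms": max(round_trip_times_ms),
    }

# Made-up example: 5 packets sent, 4 replies came back.
summary = summarize_ping(sent=5, round_trip_times_ms=[20.0, 22.0, 21.0, 25.0])
print(summary["loss_percent"])  # 20.0
print(summary["avg_ms"])        # 22.0
```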
Transport layer protocols
TCP (Transmission Control Protocol)
TCP is a highly reliable and connection-oriented protocol. It applies the 3-way handshake principle. Before it sends any data, it will first establish a connection.
This rule consists of three steps, made to establish a connection.
- SYN – The device sends a message to the server, “I want to connect with you.”
- SYN / ACK – When the server receives the message, it will reply that it is ready for communication.
- ACK – The device sends confirmation of receiving the response from the server and that it is ready for communication.
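The three steps can be written out as plain data in Python. This sketch adds one detail the list above leaves out: each side picks an initial sequence number, and the other side acknowledges it by replying with that number plus one. The function name and the starting numbers are arbitrary choices for the example, and no real connection is opened.

```python
def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three handshake messages with sequence and acknowledgment numbers."""
    return [
        # Step 1: the client asks to connect and announces its sequence number.
        {"from": "client", "flags": "SYN", "seq": client_isn},
        # Step 2: the server acknowledges it (client number + 1) and sends its own.
        {"from": "server", "flags": "SYN/ACK", "seq": server_isn, "ack": client_isn + 1},
        # Step 3: the client acknowledges the server's number + 1; the connection is open.
        {"from": "client", "flags": "ACK", "seq": client_isn + 1, "ack": server_isn + 1},
    ]

for message in three_way_handshake(client_isn=1000, server_isn=5000):
    print(message)
```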
The high reliability of TCP comes from acknowledgments: the device makes sure that the data it sent has been received by the server, and the server makes sure that the data it sent has been received by you. If the server sends 10 data packets and, for some reason, you do not receive one of them and so do not confirm its receipt, the server will send the lost packet again.
TCP also delivers data in order. Each sent packet is numbered. Although packets may still arrive out of order, TCP will arrange them in order before passing them to the application.
To summarize the advantages of TCP:
- Set up a connection before sending any data
- Data delivery acknowledgment
- Retransmission of lost data
- Deliver data in order
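Retransmission and in-order delivery can be sketched with numbered segments in Python. This is a simplified model with invented function names and data: real TCP numbers bytes rather than whole segments and handles all of this inside the operating system.

```python
def missing_segments(received: dict, total: int) -> list:
    """Sequence numbers that never arrived and have to be retransmitted."""
    return [seq for seq in range(total) if seq not in received]

def reassemble(received: dict) -> str:
    """Put segments back in sequence order before handing data to the application."""
    return "".join(received[seq] for seq in sorted(received))

# Made-up transfer: 4 segments were sent, they arrive out of order and number 2 is lost.
received = {3: "ters", 0: "Hello", 1: ", "}
print(missing_segments(received, total=4))  # [2]

received[2] = "tes"          # the sender retransmits the unacknowledged segment
print(reassemble(received))  # Hello, testers
```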
UDP (User Datagram Protocol)
UDP sends data and doesn’t care if the device has received it or not. It also doesn’t care if some packets are lost. But the significant advantage of the User Datagram Protocol is that its packets carry less overhead than TCP’s: a UDP header is 8 bytes, while a TCP header is at least 20 bytes (about 60% lighter).
UDP is an economical version of TCP.
- Connectionless and unreliable.
- No data retransmission
- No data delivery acknowledgment
- Data can arrive out of order
You may ask the question, then why use UDP? It’s such an unreliable protocol!
In some cases, UDP is better because TCP has significant overheads (data retransmission, delivery acknowledgment, etc.). UDP is often used to transmit data in real time: video streaming or audio such as Skype calls.
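One way to see where the "about 60% lighter" figure can come from is to compare the fixed header sizes of the two protocols. The Python sketch below describes both header layouts with the standard struct module. It assumes a TCP header without options (the minimum size) and only measures the layouts, without sending anything.

```python
import struct

# "!" means network byte order with no padding; H = 2 bytes, I = 4 bytes, B = 1 byte.
# UDP header: source port, destination port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")
# TCP header without options: source port, destination port, sequence number,
# acknowledgment number, data offset/reserved, flags, window, checksum, urgent pointer.
TCP_HEADER = struct.Struct("!HHIIBBHHH")

saving_percent = 100 * (TCP_HEADER.size - UDP_HEADER.size) // TCP_HEADER.size

print(UDP_HEADER.size)  # 8
print(TCP_HEADER.size)  # 20
print(saving_percent)   # 60
```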
Application layer protocols
Network management protocols
DNS (Domain Name System) – Translates a domain name into an IP address. Domain names are used because they’re human-friendly: it’s easier to remember a domain name (www.google.com) than an IP address (126.96.36.199). When you type a website address into a browser, the browser sends a request to a DNS server for the IP address of that domain.
If you type into a browser IP 188.8.131.52, then you should see the Google page because this is Google’s IP address. I can get it directly by requesting the DNS in the terminal. Type in terminal: nslookup www.google.com.
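The same kind of lookup can be done from Python with the standard socket module. This is a minimal sketch: the function name resolve is made up, and the call goes through the operating system's resolver, which may also consult local sources such as the hosts file before asking a DNS server.

```python
import socket

def resolve(hostname: str) -> list:
    """Ask the system resolver for the IP addresses of a host name."""
    results = socket.getaddrinfo(hostname, None)
    # Each result is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({result[4][0] for result in results})
```

Calling resolve("www.google.com") on a machine with internet access should return one or more addresses. The exact values depend on where and when you ask, so they are not shown here.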
NTP (Network Time Protocol) – This is an uncomplicated and straightforward protocol. It is used for automatic time synchronization in devices connected to a network. Imagine now manually synchronizing time for 10 or 50 devices. This would be ineffective.
Some devices, procedures, or safety mechanisms require accurate time synchronization for proper operation. Also, thanks to NTP, finding the causes of any network or device errors is easier, because the timestamps in the logs let us work out the order of events that caused the failures.
SNMP (Simple Network Management Protocol) – This is used for monitoring, management of updates, and diagnostics of networks and network devices.
Remote authentication protocols
SSH (Secure Shell) – This allows you to remotely log in to the terminal on network devices and administer them (e.g., a router, firewall, or remote server). SSH is secure because communication is encrypted. SSH uses the TCP protocol.
File transfer protocols
FTP (File Transfer Protocol) – The purpose of this protocol is to list the files/folders on a server and to add, delete, or download them. A good example is sending website files to a server. To do this, you need to use an FTP client with which you can authenticate yourself and get access to the FTP server. A popular FTP client is FileZilla. FTP uses TCP.
A significant flaw of FTP is the lack of data encryption. Therefore, to ensure secure authentication and transfer of files, it is worth using FTPS (FTP Secure, also known as FTP-SSL) or SFTP (SSH File Transfer Protocol). They serve the same purpose as FTP but encrypt the transmitted data.
SMTP (Simple Mail Transfer Protocol) and IMAP (Internet Message Access Protocol) are two protocols used in sending and receiving emails. SMTP’s task is to send email messages from a client to an email server or between email servers. IMAP is used to manage and retrieve email messages from an email server.
- In the beginning, the email message is sent to the sender’s email server (Gmail)
- Then the Gmail email server sends an email message to the recipient’s email server (WP)
- Finally, IMAP retrieves the email message from the WP email server to the recipient’s client.
When the sender and recipient have the same email service provider (Gmail), step 2 will be skipped.
HTTP/HTTPS – I have written a separate article on HTTP. You can read it here: What is HTTP protocol – introduction to HTTP for Testers. I explain there exactly how HTTP works. HTTPS extends HTTP functionality with data encryption protocols.
VoIP protocols (Voice over IP)
SIP (Session Initiation Protocol) – This performs an administrative function (it can run over TCP or UDP). It is used only to set up and close an audio or video connection.
RTP (Real-Time Transport Protocol) – This is used to transfer data during audio or video calls (using UDP).
For example, let’s say you want to call someone on Skype. SIP will be used to establish the connection. When the connection is established, the RTP springs into action and transmits the data. When you end the conversation, SIP will close the connection.
You have come to know many new concepts today. You now know how data flows in networks and that it goes through a rather complicated process. All of the topics I have touched on are so extensive that each could easily have a separate article of its own. However, I have tried to present them at a fairly general, easy-to-understand level, without going too deeply into the more technical aspects.
If you think I have managed to explain things understandably and interestingly, please share this article on social media. And if you have any questions, also feel free to ask them in the comments below.