Post

Nginx from scratch part 1

Nginx from scratch part 1

What is Nginx?

Well Nginx is a lot of things, but it is very popular for being a load balancer Nginx is open source and also has Windows builds (but honestly who cares) it also has a javascipt module kind of thing called njx

if any of you who has never dealed with devops but have heard about nginx its probably because you had to run some php website, nginx is not just used for serving php web content, but it has various applications

It can be used as a

  • Web Server
  • load balancer
  • reverse proxy
  • cache server
  • Media server
  • API gateway
  • Dynamic content handling

Why Nginx over Apache

Well this is more of a personal opinion,firstly nginx is very very simple and I dont really like the apache config syntaxes they look like XML but no it’s not XML and it’s confusing sometimes Also it’s a bit old software and apache2 and apache have some very different syntaxes, it’s confusing to me, it’s good for php and other static content but configuring it for other things is a bit questionable in my opinion apparently nginx is a bit more secure than apache

Nginx Installation and Directory Structure

Install nginx with your favorite package manager

For ubuntu or debiain based distros

1
sudo apt install nginx

For arch based distros

1
sudo pacman -S nginx

Or use this if you want to install using the official nginx repository

Directory Structure Like most global configuration files nginx config are also present in the /etc directory.

After successful installation of nginx you can find nginx.conf in /etc/nginx/

The directory Structure might look something like this

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
/etc/nginx/                # Main configuration directory
    ├── nginx.conf         # Main Nginx configuration file
    ├── conf.d/           # Directory for additional configuration files
    ├── sites-available/   # Directory for available server block configurations
    ├── sites-enabled/     # Directory for enabled server block configurations
    ├── snippets/          # Directory for reusable configuration snippets
    └── mime.types         # File defining MIME types

/var/www/                  # Default web root directory
    └── html/              # Default directory for serving web content
        └── index.html     # Default index file

/usr/share/nginx/html/     # Alternative web root directory (depends on installation)

/usr/lib/nginx/            # Directory for Nginx modules and libraries

/var/log/nginx/            # Directory for Nginx log files
    ├── access.log         # Access log file
    └── error.log          # Error log file

/usr/bin/nginx             # Nginx binary executable

There maybe more files depending on how you have installed it but these are the main files
we are mainly concerned with nginx.conf

Nginx syntaxes

There are two main syntaxes in nginx

  • Directive : Directives are the individual instructions in the Nginx configuration file. Each directive typically consists of a name followed by one or more parameters. Directives can control various aspects of Nginx’s behavior, such as server settings, logging, and performance tuning, Each directive ends with a ‘;’.
    Some examples of nginx directives
    1
    2
    3
    4
    5
    
    listen 80;
    server_name example.com;
    root /var/www/html/;
    proxy_pass http://localhost:8080;
    index index.html;
    
  • Content : Contexts are blocks that group directives together. Each context defines a specific scope for the directives it contains.
    Examples of nginx contexts
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    
    http{
    # main context
      server{
        # server context
        #  a sub context of html context
        # a html context can have n number of server contexts (generally there are 2, one for http and one for https)
        location <route> {
            # location context
            # a sub context of server context
            # a server context can have n number of location contexts
        }
      }
    }
    events{
    # events context
    }
    

Nginx as a web server

  • What is a web server

    • In simple terms web servers are software that serve web content, like html, css files, images, etc. They communicate with http (Hyper Text Transfer Protocol)
    • yes, previewing a html file in a web browser and viewing a html file served from a web server are different, they may appear the same but the diffrence is http, you will understand the different when you try to access othe files using fetch or other js functions
    • web servers are bind to a port, generally 80 for http and 443 for https as they are the default ports for http and https respectively

By default installing nginx will give you a web server configuration, that is supposed to serve /var/www/html/index.html (depending on your distribution and installation method)
It has too many things that it might be very confusing and scary at first look, save that config somewhere else and delete all its content as you will its better to understand if you write from scratch

1
2
3
4
5
6
7
8
9
10
11
http{
  server{
    listen 80;

    location / {
      root /var/www/html;
    }
  }
}

events{}

To test your config run

1
sudo nginx -t

if successful you can start your nginx server with

1
sudo systemctl start nginx

nginx_default_index.html If you are getting error like 502 bad gateway, make sure you have index.html in the folder /var/www/html if its with a different name you can use the index directive to mention that file to be rendered on that route
I highly recommend not to use a different directory than the default ones because nginx may not have enough read permissions to read that file, and if you have your web files in /home/user you mihght have to give permissions to others users to access your files which might make your directory insecure, If you really insist on having your web files somewhere else you can make a symlink to /var/www/html/

For more information on web server, you can read the config at /etc/nginx/sites-enabled/default

The location context and root directive

  • serving files based on routes

    The location context takes the route as a value like / or /images /assets etc and renders files requested
    For example

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    
    http{
    server{
      listen 80;
    
      location / {
        root /var/www/html/;
      }
      location /posts {
        root /var/www/posts;
      }
    }
    }
    events{}
    
    • a request to / or /index.html will serve the file /var/www/html/index.html # if the file exits
    • a request to /hello.html will serve the file /var/www/html/hello.html # if the file exists
    • a request to /posts/ or /posts/index.html will serve the file /var/www/posts/index.html # if the file exists
    • a request to /posts/hello.png will serve the file /var/www/posts/hello.png # given the file exits
  • serving files based on file extension

    The location context can also be used to serve files based on their file type like /*.html can take files from /var/www/html and /*.png can take files from /var/www/images
    For Example

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
http {
  server {
    listen 80;

    eror_page 404 /404.html;

    # Serve HTML files
    location ~* \.html$ {
      root /var/www/html;
    }

    # Serve PNG files
    location ~* \.png$ {
      root /var/www/images;
    }

    # serve 404 if files with other extensions were requested or a non existing file was requested
    location / {
      try_files $uri $uri/ =404;
    }

    location = /404.html {
      root /var/www/errors;  # Directory where 404.html is located
      internal;              # Prevent direct access to this page
  }
  }
}
events{}
  • The location context here takes a regular expression and matches the file name with it, it routes based on the file name
    • ~ is used to tell that the url is case sensitive , whereas ~* is used to tell url is case insensitive ie both index.html and Index.HTML will render the same file
    • . in regex means a single char
    • \. here is used to refer to a literal . it is escaped using the \ charecter , we are searching for a literal . as in .html etc
    • $ symbol tells that it is the end of the regular expression
  • So in this situation a request to / or ‘/index.html’ will render /var/www/html/index.html as index.html is the default index for any route unless explicitly mentioned
  • When a request is sent to route /nginx.png, the server will serve the file /var/www/images/nginx.png as the root of .png files is /var/www/images
  • a request to /nginx.json, will take you to /404.html route and render the file /var/www/errors/404.html since no location is mentioned for .json files

Nginx as a reverse proxy

  • A reverse proxy is something that hides the address of your actual server by staying in between the server and the client, it can make the service secure by restricting access, hiding the ip address of the service, filtering clients, rate limiting etc

configuring Nginx as a reverse proxy is relatively simple

1
2
3
4
5
6
7
8
9
10
http{
  server{
    listen 80;

    location / {
      proxy_pass http://localhost:8080;
    }
  }
}
events{}
  • Basically this allows you to access http service on port 8080 from port 80 which is generally done as port 80 is default port for http
  • when i say port 80 is the default port of http i mean to say that i dont have to type http://localhost:80 to access port 80 but i can just type locatlhost to access the same thing as the default protocol is http and port is port 80 similarly for https i dont have to type https://localhost:443 i can just type https://localhost it will by default take the port 443 and that is the reason why we dont search https://google.com:443

  • you might wanna mess around with different directives of a reverse proxy, believe me nginx has a really good documentation
  • few distributions come with a file /etc/nginx/proxy_params ith some default params like
    1
    2
    3
    4
    
    proxy_set_header Host $http_host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    

    you can include this file or any other nginx config using the include directive

    1
    
    include /etc/nginx/proxy_params;
    

Nginx as a load balancer

well a load balancer is something that distributes traffic among the servers reverse proxy config itself can act as a load balancer given that we provide a list of upstreams

The following config uses load balancing technique round robin

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
http{

  upstream backend{ # name of your upstream
    server backend1.com;
    server backend2.com;
    server backend3.com;
  }

  server{
    listen 80;

    location / {
      proxy_pass http://backend;
    }
  }
}

Problems with load balencing

There is no problem if it is just a static site, but it becomes a huge problem if client is in session with the server.
Lets try to understand this problem

  • say client Alice is connected to backend via load balancing
  • she has registered her account and made a login via backend1.com
  • she sends a next request lets say to go to the home page, the request is round robined and it might go to backend2.com but alice was authenticated by backend1.com and backend2.com doesnt know alice so it will ask alice to login again, this is happening because the session is not persistant

solutions

  • sticky sessions : in this approach, Alice will always be redirected to backend1.com
    • pro : easy to implement
    • con : uneven load balancing
  • session replication : In this approach, session data is replicated across all backend servers. This way, any server can handle requests from any user, as they all have access to the same session data.
    • pro : evenly balenced load
    • con : can be expensive as it needs to store session data in a a cache database like redis
  • Token based Authentication : Instead of maintaining session state on the server, you can use token-based authentication (like JWT - JSON Web Tokens). The server issues a token upon successful authentication, which the client includes in subsequent requests.
    • pro : no need to store session data over a server

Sticky sessions implementation in nginx

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
http{
    upstream backend {
        ip_hash;
        server backend1:3000;
        server backend2:3000;
        server backend3:3000;
        server backend4:3000;
    }

    server{
        listen 80;

        location / {
            proxy_pass http://backend/;
        }
    }
}
events {
}

belive me this annoyed me during my RTP in sem 4, like it was one line , one directive “ip_hash” that I was missing, it took me few days to understand the reason i was getting logged out, but it was worth it

This has become too long, Ill continue in part 2

This post is licensed under CC BY 4.0 by the author.