Analysis of the Nginx DNS resolution process

How does Nginx resolve domain names? How can you use the resolution facilities Nginx provides in a module of your own? And how is it implemented internally?

Using Nginx 1.5.1 as the example, this article analyzes how Nginx resolves domain names, starting from how the ngx_mail_smtp module does it. To keep the flow simple and focused, the sample code leaves out some error handling, such as failed memory allocations. DNS queries come in two kinds: looking up addresses by name and looking up names by address. The two are structured very similarly in the code, so only the name-to-address lookup is covered here. The article covers the following topics:

  1. The function interface for name queries
  2. Analysis of the resolution flow
  3. Query scenarios and their implementation

1. The function interface for name queries

With synchronous I/O, calling gethostbyname() or gethostbyname_r() is enough to look up the IP address for a domain name, but because the lookup may involve querying a remote server over the network, it can take quite a long time.

To avoid blocking the current thread, Nginx resolves names asynchronously. The query consists of three main steps, Continue reading “Analysis of the Nginx DNS resolution process”
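
As a quick preview of that interface, here is a minimal sketch (mine, not from the article) of how a module might drive the asynchronous resolver in 1.5.x. The function names my_resolve_handler/my_start_resolving are hypothetical, the resolver pointer is assumed to come from the module configuration (for the mail module, the ngx_resolver_t created by the resolver directive), and error handling is simplified in the same spirit as the article's examples:

static void
my_resolve_handler(ngx_resolver_ctx_t *ctx)
{
    ngx_uint_t  i;

    if (ctx->state) {
        /* resolution failed; ctx->state holds an error code such as
           NGX_RESOLVE_NXDOMAIN, and ngx_resolver_strerror(ctx->state)
           turns it into a readable message */
        ngx_resolve_name_done(ctx);
        return;
    }

    for (i = 0; i < ctx->naddrs; i++) {
        /* in 1.5.1 ctx->addrs[i] is an in_addr_t in network byte order;
           use the resolved addresses here, e.g. pick one and connect */
    }

    ngx_resolve_name_done(ctx);                /* release the query context */
}

static ngx_int_t
my_start_resolving(ngx_resolver_t *r, ngx_str_t *name, void *data)
{
    ngx_resolver_ctx_t  *ctx;

    ctx = ngx_resolve_start(r, NULL);          /* allocate a query context */
    if (ctx == NULL || ctx == NGX_NO_RESOLVER) {
        return NGX_ERROR;
    }

    ctx->name = *name;                         /* the name to resolve      */
    ctx->handler = my_resolve_handler;         /* called when the answer   */
    ctx->data = data;                          /* (or an error) arrives    */
    ctx->timeout = 30000;                      /* query timeout in ms      */

    return ngx_resolve_name(ctx);              /* send the DNS query       */
}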

Handling file compression, encryption/decryption, and signing with streams

Background

A recent project involves a lot of file processing and therefore a lot of I/O. In some places a file is decrypted and then encrypted again; in others it is compressed, encrypted, and then signed. Most importantly, every unencrypted file has to be securely deleted: first overwritten with zeros, then removed.

The initial solution and its problems

At first we used files to hold the intermediate data produced during processing. Taking changing the password of an encrypted file as an example, the steps are:

  1. Decrypt the original encrypted file into a temporary file
  2. Read the decrypted temporary file and encrypt it into the final file
  3. Overwrite the temporary file with zeros and delete it

Sample code:

FileEncryptor.decrypt(originalEncryptedFile, tempFile);
FileEncryptor.encrypt(tempFile, resultEncryptedFile);
FileEraser.safeErase(tempFile);

The I/O operations involved in this process are shown in the figure below: Continue reading “Handling file compression, encryption/decryption, and signing with streams”

The consequences of not closing open files on Linux

While testing file-operation performance over the last couple of days, I found some places where files were opened but never closed. The consequences of not closing files are fairly serious, especially for server-side programs. So what exactly can go wrong?

1. No new files can be opened.
A process that keeps opening files without closing them quickly hits the per-process limit on open files, after which it cannot open any more.
On Linux you can check and change the limit for the current session with ulimit -n; for example, on my machine:

$ ulimit -n
7168
$ ulimit -n 10000
10000

You can also edit /etc/security/limits.conf to change the limit permanently.

2. Disk space gets used up.
If a file is deleted while it is still open, then as long as it remains open, Continue reading “The consequences of not closing open files on Linux”
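
To make the first consequence concrete, here is a small sketch (mine, not from the post) that keeps opening the same file without ever closing it, until open() fails with "Too many open files" (EMFILE) once the per-process limit is reached:

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    int fd, count = 0;

    for (;;) {
        fd = open("/etc/hostname", O_RDONLY);   /* any readable file works */
        if (fd < 0) {
            /* with ulimit -n 7168 this typically reports EMFILE after about
               7165 opens (stdin/stdout/stderr occupy the first three slots) */
            printf("open() failed after %d descriptors: %s\n",
                   count, strerror(errno));
            return 1;
        }
        count++;                                 /* descriptor leaked on purpose */
    }
}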

Development notes for socks5 protocol

The RFC for the socks5 protocol: http://www.ietf.org/rfc/rfc1928.txt

Here’s the procedure of the CONNECT command for a TCP-based connection:

   Client   |   Server
----------------------------------------------------
1. Client initiates the connection
2. Client sends the initial auth method selection message
             3. Server replies with the selected auth method

4. [ Authentication based on the selected auth method ] optional,
   not required when the auth method is 0x00: No Auth

5. Client sends the request for the destination address
             6. Server connects to the destination address
             7. Server replies with the bound address and port
                of the connection to the target address

8. Data is transferred between client and destination
9. The connection is closed after the transfer finishes
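
To make the byte layout concrete, here is a rough sketch (mine, not from this post) of the two messages the client sends in steps 2 and 5 for a CONNECT to a domain name, following RFC 1928; real code must of course also read and validate the server replies between the writes:

#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>

/* Step 2: version identifier / method selection message:
   VER, NMETHODS, METHODS...  -> 0x05, one method, 0x00 = "no authentication" */
size_t build_method_selection(uint8_t *buf)
{
    buf[0] = 0x05;
    buf[1] = 0x01;
    buf[2] = 0x00;
    return 3;
}

/* Step 5: request: VER, CMD, RSV, ATYP, DST.ADDR, DST.PORT
   CMD 0x01 = CONNECT, ATYP 0x03 = domain name (with a 1-byte length prefix) */
size_t build_connect_request(uint8_t *buf, const char *host, uint16_t port)
{
    size_t len = strlen(host);          /* must be <= 255 */
    size_t i = 0;
    uint16_t nport = htons(port);       /* port in network byte order */

    buf[i++] = 0x05;                    /* VER  */
    buf[i++] = 0x01;                    /* CMD: CONNECT */
    buf[i++] = 0x00;                    /* RSV  */
    buf[i++] = 0x03;                    /* ATYP: domain name */
    buf[i++] = (uint8_t) len;
    memcpy(buf + i, host, len);
    i += len;
    memcpy(buf + i, &nport, 2);
    return i + 2;
}

int main(void)
{
    uint8_t buf[262];                   /* 4 + 1 + 255 + 2 is the maximum size */

    printf("method selection: %zu bytes\n", build_method_selection(buf));
    printf("connect request:  %zu bytes\n",
           build_connect_request(buf, "www.youku.com", 80));
    return 0;
}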

Notes:

  • The procedure applies per connection: if, for example, each HTTP request opens a new connection, then every request goes through the steps described above
  • The destination address may be an IPv4/IPv6 address or a domain name; even when the address type says "domain name", the field can still contain an IP address in dotted format. Here’s a real example:
    www.youku.com uses the CDN service provided by http://www.chinacache.com, so visitors in different locations reach different servers. I found that the browser (at least Chrome) may put an IP address such as 65.255.34.6 in the domain name field and mark the address as a domain name
  • When the server sends its reply to the client, it must include the address and port it bound when connecting to the target, so the reply can only be sent after the connection to the upstream has been established

Notes for playing with ptrace on 64 bits Ubuntu 12.10

These are my notes from working through “Playing with ptrace” (http://www.linuxjournal.com/article/6100).

The original examples were written for a 32-bit machine and don’t work on my 64-bit Ubuntu 12.10.

Let’s start with the first ptrace example:

#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdio.h>        /* for printf() */
#include <linux/user.h>   /* For constants
                                   ORIG_EAX etc */
int main()
{   pid_t child;
    long orig_eax;
    child = fork();
    if(child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        execl("/bin/ls", "ls", NULL);
    }
    else {
        wait(NULL);
        orig_eax = ptrace(PTRACE_PEEKUSER,
                          child, 4 * ORIG_EAX,
                          NULL);
        printf("The child made a "
               "system call %ldn", orig_eax);
        ptrace(PTRACE_CONT, child, NULL, NULL);
    }
    return 0;
}

The compiler shows the following error:

fatal error: 'linux/user.h' file not found
#include <linux/user.h>

A few things need to change because:

  1. linux/user.h no longer exists
  2. The 64-bit registers are named R*X, so EAX becomes RAX

There are two ways to fix this: Continue reading “Notes for playing with ptrace on 64 bits Ubuntu 12.10”
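
For reference, here is one common adaptation for x86-64 (a sketch of my own, not necessarily one of the two solutions this post goes on to describe): drop linux/user.h, take ORIG_RAX from sys/reg.h, and peek at offset 8 * ORIG_RAX because registers are 8 bytes wide on 64-bit:

#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/reg.h>      /* ORIG_RAX */
#include <unistd.h>
#include <stdio.h>

int main(void)
{
    pid_t child;
    long orig_rax;

    child = fork();
    if (child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        execl("/bin/ls", "ls", NULL);
    } else {
        wait(NULL);
        /* the system call number lives in orig_rax, at byte offset
           8 * ORIG_RAX in the USER area */
        orig_rax = ptrace(PTRACE_PEEKUSER, child, 8 * ORIG_RAX, NULL);
        printf("The child made a system call %ld\n", orig_rax);
        ptrace(PTRACE_CONT, child, NULL, NULL);
    }
    return 0;
}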

Understanding the compile-time operator: sizeof

While reading the Redis source code, I came across the following line:

dict *d = zmalloc(sizeof(*d));

After searching for another definition of ‘d’, I realized that d is the very pointer being defined on that same line. To understand what happens, Continue reading “Understanding the compile-time operator: sizeof”
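
Here is a tiny illustration (mine, not from Redis) of why sizeof(*d) is safe even though d has not been assigned yet: sizeof only inspects the type of its operand at compile time, so the expression *d is never actually evaluated:

#include <stdio.h>
#include <stdlib.h>

struct dict { int used; int size; };         /* stand-in for Redis's dict */

int main(void)
{
    struct dict *d = NULL;

    /* no dereference happens here: the compiler looks only at the type of *d,
       which is struct dict, and substitutes its size as a constant */
    printf("sizeof(*d) = %zu\n", sizeof(*d));

    d = malloc(sizeof(*d));                   /* same idiom as the zmalloc() call */
    free(d);
    return 0;
}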

HOW TO: Create ssh tunnel at boot time under Ubuntu

Create ssh tunnel

The simplest command to create an ssh tunnel is:

# The following command creates a socks5 proxy on port 7070, which you can then use in your browser
ssh -ND 7070 HOSTNAME

Use autossh instead of ssh

I prefer autossh over ssh because it automatically reconnects when the connection is lost, Continue reading “HOW TO: Create ssh tunnel at boot time under Ubuntu”

Automatic test allocation based on historical test results in a CI environment

Both my current project and my previous ones have run into the problem of test runs taking too long, requiring the tests to be split across multiple machines and run in parallel. Some projects do this allocation by hand; this article describes how to do it automatically, so less time is spent on this kind of chore.

The original problem

As a project goes on, more and more tests are added and a full test run takes longer and longer, especially when there are many UI tests. On my current project, for example, the browser-based UI tests (called Functional Tests below) take about 80 minutes on a single machine. That means a developer has to wait a long time after committing code to get a result from the CI server, and an overly long CI feedback cycle greatly reduces how much attention developers pay to CI; even those who keep paying attention lose efficiency from constantly switching context.

The initial solution

At first people try various simple approaches: split the tests into several groups by hand using tags or annotations (different test frameworks use different concepts), add more build agents, and have each build agent run only one group. Ideally every build agent's share takes equally long, so in the ideal case the run time becomes: total test time / number of build agents (e.g. the 80-minute functional test suite spread over 4 agents would take about 20 minutes).

But this creates a new problem: whenever tests are added, deciding and updating which tests run on which machine is tedious. We have to keep checking how long each build agent currently takes and which agent a new test should be added to. At work I often hear people say: the tests are too slow now, let's add another build agent and move some tests over. When a simple, repetitive chore comes up this often, it is a sign that it deserves a closer look for a better solution. Continue reading “Automatic test allocation based on historical test results in a CI environment”

Using multi-configuration project for distributed builds on Jenkins

Sometimes we need to run tests on different machines for various reasons:

  1. to run tests in different environments, e.g. against different OSes or databases
  2. to distribute the tests for faster feedback, e.g. split them into n parts and run only one part on each machine

The following is a real Jenkins screenshot from one of my projects: we created several jobs for the same functional tests, and the only difference between them is which test tag(s) each job runs.

The Problem

The settings for each job are almost the same: Continue reading “Using multi-configuration project for distributed builds on Jenkins”