NginX and Riak

默北 | Tags: Nginx, NoSQL, distributed, database

Storing and delivering static content is a very real problem these days. Many people need large, reliable storage for static images and other static files, and a way to deliver them to end users. The most popular solution is still NFS-mounted storage that is accessible from all front-ends, but this solution has serious bottlenecks:

  1. Hard to back up.
  2. Everything relies on NAS.
  3. Statically mounted external storage is needed.

Now let's dig deeper:

Hard to back up: Some of you will say that this is not so! But imagine that you have 10 TB of small images which your application uses regularly and which are critical. A standard rsync and/or tar run could take a lot of time and system resources, which is definitely not what we want.
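To get a feel for the scale, here is a rough Python estimate of how long merely moving that much data takes. The 1 Gbit/s link and the perfectly saturated transfer are my assumptions, not measurements; a real rsync over millions of small files is usually slower because of per-file overhead.

```python
def transfer_hours(size_tb: float, link_gbit: float) -> float:
    """Best-case hours to move size_tb terabytes over a link_gbit Gbit/s link."""
    size_bits = size_tb * 1e12 * 8            # decimal terabytes to bits
    seconds = size_bits / (link_gbit * 1e9)   # assumes the link is fully saturated
    return seconds / 3600


if __name__ == "__main__":
    # Assumed figures: 10 TB of images, a single 1 Gbit/s link
    print(f"{transfer_hours(10, 1):.1f} hours")
```

Even in this best case a full copy takes most of a day, which is why a full-copy backup strategy does not scale well here.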

Everything relies on NAS: So what? We can buy a reliable NAS/SAN with RAID (1-10) storage and use it. But if we take a closer look, we see that to get 10 TB of space from, for example, 1 TB 15k RPM SAS drives, we need at least 11 drives plus a RAID controller. So far so good, but what is the price for that? After browsing online shops and price lists you will see that this is quite expensive, especially if your data is critical and you need a hot backup, that is, a second NAS/SAN. Another bottleneck is that this solution only scales vertically, which is expensive and hard to achieve. And last but not least, everything has to share the same I/O device. This is truly a problem for large-scale deployments.

Statically mounted external storage is needed: This means that your whole system relies on an externally mounted device and, regardless of how reliable it is, it is a kind of SPOF (single point of failure).

Putting all this together shows that the classical shared-storage architecture is hard to implement, expensive, and slow for large deployments. This may not be a big deal if you are the IT department of a bank and your management has lots of money and very little “imagination”. In that case this article is not for you.

So, for everyone else, let's summarize what we need:

  1. Reliable storage.
  2. Low-latency access to files.
  3. Easy management and backup.
  4. Reliability and fault tolerance.
  5. Easy access and less programming overhead.

After spending a lot of time looking for a solution to these problems, we found what seems to be an ideal one:

  1. Riak (will act as storage and deliver files). For our needs the free community edition is more than enough.
  2. NginX (will act as reverse proxy and URL filter).

Before starting, let's summarize what these two tools will give us:

Riak: A wonderful, fully clustered NoSQL server written in Erlang. It works asynchronously, performs very well, and is easy to access via REST, Protocol Buffers, and many other interfaces. It also has a built-in real-time search index and a MapReduce implementation. For now we will use only a small part of Riak, namely storage for static files. In this scenario it has several benefits over the shared-storage solution:

  1. Low latency to access files. (Riak's Bitcask backend needs a single seek to retrieve any value.)
  2. Horizontally scalable. (Just add more cheap servers to the cluster.)
  3. Much more throughput. (For example, 10 servers with one 1 Gbit port each have 10 Gbit in total, minus about 10% for internal cluster traffic.)
  4. No need for expensive RAID arrays, SANs, etc.
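The throughput figure in item 3 is simple arithmetic. A small Python sketch, treating the 10% internal overhead as the article's rough estimate rather than a measured value (the function name is only illustrative):

```python
def cluster_throughput_gbit(nodes: int, port_gbit: float = 1.0,
                            internal_overhead: float = 0.10) -> float:
    """Aggregate client-facing bandwidth of the cluster, in Gbit/s."""
    raw = nodes * port_gbit                   # every node serves clients directly
    return raw * (1.0 - internal_overhead)    # subtract the share used between nodes


if __name__ == "__main__":
    print(f"{cluster_throughput_gbit(10):.1f} Gbit/s")
```

Unlike a single NAS head, this number grows with every server you add.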

So let's start my favorite part: installing and configuring the tools mentioned above. As I'm a Debian fan, I will do this on the current stable release, Debian 6.0 Squeeze.

First you need to download and install Riak. At the time of writing this was the latest version of Riak, but before copy-pasting, check for the latest version here: http://basho.com/resources/downloads/.

Download and install Riak:

# cd /usr/local/src
# wget http://s3.amazonaws.com/downloads.basho.com/riak/CURRENT/debian/6/riak_1.2.1-1_amd64.deb
# dpkg -i riak_1.2.1-1_amd64.deb

Done! Riak is installed. Do not start it for now. If it is already running, stop it just in case:

# /etc/init.d/riak stop

Now we need to cluster it and make some configuration changes. By default Riak binds to 127.0.0.1, which is not a good idea for clusters, so change it to the internal IP address of the server. Do not bind Riak to the server's public IP, if one exists.

Edit /etc/riak/app.config and change:

{pb_ip,   "192.168.235.111" },

and

{http, [ {"192.168.235.111", 8098 } ]},

Also make sure you have configured the /etc/hosts file and the system hostname. A correct /etc/hosts should look something like this:

127.0.0.1 localhost
192.168.235.111 riak1.your-domain.com riak1

If you do not have your own internal DNS, you will also have to add the other nodes to /etc/hosts, although it is better to have DNS.

192.168.235.112 riak2.your-domain.com riak2
192.168.235.113 riak3.your-domain.com riak3
192.168.235.11N riakN.your-domain.com riakN
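With more than a handful of nodes, the entries can be generated rather than typed. A small Python sketch, assuming (as in the example above) that node N gets the address 192.168.235.(110+N) and the name riakN.your-domain.com; the function name hosts_lines is only illustrative:

```python
from typing import List


def hosts_lines(count: int, domain: str = "your-domain.com",
                prefix: str = "192.168.235.", first_host: int = 111) -> List[str]:
    """Build /etc/hosts lines for nodes riak1..riak<count>."""
    lines = []
    for i in range(count):
        n = i + 1                              # node numbers start at 1
        lines.append(f"{prefix}{first_host + i} riak{n}.{domain} riak{n}")
    return lines


if __name__ == "__main__":
    print("\n".join(hosts_lines(3)))
```

Append the output to /etc/hosts on every node so that all of them resolve each other the same way.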

Also make some changes to the storage configuration.

Format your big disk and mount it at /var/lib/riak:

# mkfs.xfs /dev/sdb1
# mount /dev/sdb1 /mnt
# mv /var/lib/riak/* /mnt/
# umount /mnt
# mount /dev/sdb1 /var/lib/riak
# chown -R riak:riak /var/lib/riak

Or, better, just create another mount point and reconfigure Riak to use it:

# mount /dev/sdb1 /opt
# mkdir /opt/riak
# chown riak:riak /opt/riak
# mv /var/lib/riak/* /opt/riak

Change paths in /etc/riak/app.config:

{riak_core, [
{ring_state_dir, "/opt/riak/ring"},
...--------...
{bitcask, [{data_root, "/opt/riak/bitcask"} ]},
{eleveldb, [{data_root, "/opt/riak/leveldb"}]},

It would also be nice to enable the Riak admin console (Riak Control) to get a nice web UI.

It is installed by default, so all you need to do is change the user list from {userlist, [{"user", "pass"}]} to actual values and make sure that {admin, true} exists.

It is also highly recommended to enable HTTPS and use a secure link to administer Riak. For that you need to enable HTTPS in the riak_core section and add the certificate and private key files:

{https, [ {"192.168.235.111", 8069 } ]},
    {ssl, [
        {certfile, "/etc/riak/ssl/riak.crt"},
        {keyfile, "/etc/riak/ssl/riak.pem"}
]},

That's all! Now restart Riak and log in to Riak admin at https://192.168.235.111:8069/admin.

Also edit /etc/riak/vm.args and change

-name riak@127.0.0.1

to

-name riak@riak1.your-domain.com

Now restart Riak:

/etc/init.d/riak restart

Now we need more nodes to have redundancy. Let's imagine that we have a 3-node cluster for now.

So repeat all the steps above on riak2.your-domain.com and riak3.your-domain.com, using each node's own IP address and hostname.

Now you have 3 separate nodes and need to join them into a single cluster. A very nice guide to doing this is linked in the original article.

In short, after making the appropriate configuration changes, you need to do the following.

On nodes 2 and 3:

riak-admin cluster join riak@riak1.your-domain.com

And only on the riak1 node:

riak-admin cluster plan
riak-admin cluster commit
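The order matters: the join is staged on every node except the first, and the plan and commit are run once. A Python sketch that only builds the per-node command list and executes nothing; the helper name join_commands is hypothetical, while the riak-admin subcommands are the ones shown above:

```python
from typing import Dict, List


def join_commands(hosts: List[str]) -> Dict[str, List[str]]:
    """Map each host to the riak-admin commands to run on it, in order."""
    first, *others = hosts
    plan: Dict[str, List[str]] = {}
    for host in others:
        # Every additional node stages a join to the first node
        plan[host] = [f"riak-admin cluster join riak@{first}"]
    # The staged changes are reviewed and committed once, on the first node
    plan[first] = ["riak-admin cluster plan", "riak-admin cluster commit"]
    return plan


if __name__ == "__main__":
    nodes = [f"riak{n}.your-domain.com" for n in (1, 2, 3)]
    for host, commands in join_commands(nodes).items():
        for command in commands:
            print(f"{host}: {command}")
```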

Now you have a fully clustered, working Riak installation.

To test it, do the following:

curl -v -X PUT http://192.168.235.111:8098/riak/images/foo.jpg -H "Content-Type: image/jpeg" --data-binary @./foo.jpg

A full reference for the Riak curl commands is linked in the original article.
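If your application uploads files itself, the same PUT can be made from code. A minimal Python sketch using only the standard library; build_put and send are hypothetical helper names, and the node address, bucket, and key are the ones from the curl example. The default content type here is image/jpeg, the registered MIME type for JPEG files.

```python
import urllib.request


def build_put(node: str, bucket: str, key: str, body: bytes,
              content_type: str = "image/jpeg") -> urllib.request.Request:
    """Build (but do not send) a PUT for http://<node>:8098/riak/<bucket>/<key>."""
    url = f"http://{node}:8098/riak/{bucket}/{key}"
    return urllib.request.Request(
        url, data=body, method="PUT",
        headers={"Content-Type": content_type},
    )


def send(request: urllib.request.Request) -> int:
    """Send the request and return the HTTP status code from Riak."""
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status
```

To upload, read the file as bytes and call send(build_put("192.168.235.111", "images", "foo.jpg", data)).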

Now open http://192.168.235.111:8098/riak/images/foo.jpg in your favorite browser. Just for fun, open the same path on all 3 nodes and see the same picture.
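The same check can be scripted instead of done by eye. A Python sketch, assuming the three node addresses used in this article; it fetches the object from every node and compares SHA-256 digests. The function names are illustrative only.

```python
import hashlib
import urllib.request
from typing import Iterable, List

NODES = ["192.168.235.111", "192.168.235.112", "192.168.235.113"]


def node_urls(nodes: Iterable[str], bucket: str, key: str) -> List[str]:
    """One URL per node for the same bucket and key."""
    return [f"http://{node}:8098/riak/{bucket}/{key}" for node in nodes]


def all_identical(bodies: Iterable[bytes]) -> bool:
    """True if every body hashes to the same SHA-256 digest."""
    digests = {hashlib.sha256(body).hexdigest() for body in bodies}
    return len(digests) == 1


def check(bucket: str = "images", key: str = "foo.jpg") -> bool:
    """Fetch the object from every node and compare the results."""
    bodies = []
    for url in node_urls(NODES, bucket, key):
        with urllib.request.urlopen(url, timeout=5) as response:
            bodies.append(response.read())
    return all_identical(bodies)
```

Calling check() from a machine on the internal network should return True once the cluster is healthy.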

Great, we have finished the Riak installation. Now let's install and configure the NginX front-end.

apt-get install nginx

Now edit /etc/nginx/sites-enabled/default and replace its content with this:

upstream riak {
    server 192.168.235.111:8098 fail_timeout=30s;
    server 192.168.235.112:8098 fail_timeout=30s;
    server 192.168.235.113:8098 fail_timeout=30s;
}

server {
        listen 80;
        server_name  your.public.domain;

        if ( $uri !~ \. ) { return 403; }       # Require URI with file extension 
        if ( $uri !~ ^/.*/.* ) { return 403; }  # Disable access to Riak / 
        if ( $uri ~ ^/.*/.*/.* ) { return 403;} # Disable Link walk MR etc  

        location / {
                if ($request_method = GET) {
                        proxy_pass http://riak;
                        rewrite ^/(.*) /riak/$1 break;  # Prepend /riak internally so the external URL hides Riak
                }

                proxy_redirect          off;
                proxy_next_upstream     error timeout invalid_header http_500;
                proxy_connect_timeout   2;
                proxy_set_header        Host            $host;
                proxy_set_header        X-Real-IP       $remote_addr;
                proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
                proxy_set_header        Referer         ""; # Blank the Referer or Riak will 403 all requests
                proxy_hide_header       X-Riak-Vclock;      # Remove Riak-specific headers
                proxy_hide_header       Link;               # Remove Riak-specific headers
                proxy_hide_header       ETag;               # Remove another Riak header
                proxy_hide_header       Server;
        }
}
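The three `if` rules at the top of the server block are easy to misread, so here is a Python sketch that applies the same regular expressions to a path. It assumes the path is already normalized the way NginX's $uri is, and the function name is_allowed is only illustrative.

```python
import re


def is_allowed(uri: str) -> bool:
    """Return False where one of the three `if` rules would answer 403."""
    if not re.search(r"\.", uri):         # no dot anywhere: no file extension
        return False
    if not re.search(r"^/.*/.*", uri):    # fewer than two slashes: Riak root or bare bucket
        return False
    if re.search(r"^/.*/.*/.*", uri):     # three or more slashes: link walking, MapReduce, etc.
        return False
    return True


if __name__ == "__main__":
    for path in ("/images/foo.jpg", "/", "/foo.jpg", "/images/foo", "/a/b/c.jpg"):
        print(path, is_allowed(path))
```

In effect only paths of the form /bucket/key that contain a dot get through. Note that the first rule matches a dot anywhere in the path, not only in the final segment.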

After all this is done, you can get your image via http://your.public.domain/images/foo.jpg without any Riak-specific headers and links.
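The mapping between the public URL and the URL that Riak actually sees is a single prefix rewrite. A Python sketch that mirrors the `rewrite ^/(.*) /riak/$1 break;` line for one path (the function name upstream_path is hypothetical):

```python
import re


def upstream_path(uri: str) -> str:
    """Translate a public path into the path that is proxied to Riak."""
    return re.sub(r"^/(.*)", r"/riak/\1", uri, count=1)


if __name__ == "__main__":
    print(upstream_path("/images/foo.jpg"))
```

So a public request for /images/foo.jpg is served from the bucket "images" and key "foo.jpg", and clients never see the /riak prefix.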

Original article: http://netangels.net/knowledge-base/nginx-and-riak/#.Uo3xN9JkP-Z

Posted by 默北 on 23/11/2013 03:00:51
When reposting, please keep the link to this article: https://www.ttlsa.com/database/nginx-and-riak/