Introduction – A Brief Look at the Distributed File System MogileFS (1)

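MogileFS is what first got me into the field of distributed file systems. Since ttlsa has already covered Gearman, it makes sense to cover MogileFS as well: both were developed by the same person. MogileFS cleverly builds a distributed storage service on top of HTTP PUT, and it is well suited to storing small files.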

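Methodology

When I set out to understand a system, I go through the following steps:

  • What the system does, and what problem it exists to solve;
  • What the system looks like;
  • How to talk to the system.

Accordingly, this article starts with the origins of MogileFS, then sketches its architecture, introduces its basic usage, and finally covers its administration.

The end of the article describes how to integrate MogileFS into third-party applications.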

Mind map

[Figure: mogilefs-01]

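Background

MogileFS is a distributed file system developed by Danga Interactive to solve the storage problems of LiveJournal, the site the company was running at the time.

Before that, the team had already adopted techniques such as database partitioning, so the same divide-and-conquer thinking carries over into MogileFS. Today MogileFS is widely used by high-performance Web 2.0 sites; the most typical example cited is Instagram, which is said to use it as its image storage cluster.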

Terms and definitions

Understanding the terms used in MogileFS is essential to grasping its architecture.

| Term | Explanation |
| --- | --- |
| application | The thing that wants to store and load files. |
| database | The database that stores the MogileFS metadata (the namespace, and which files are where). This should be set up in an HA configuration so you don't have a single point of failure. |
| tracker | Event-based parent process/message bus that manages all client communication from applications (requesting operations to be performed), including load balancing those requests onto "query workers", and handles all communication between mogilefsd child processes. |
| storage node | Where files are stored. The storage nodes are just HTTP servers that do DELETE, PUT, etc. Any WebDAV server is fine, but mogstored is recommended. mogilefsd can be configured to use two servers on different ports: mogstored for all DAV operations (and sideband monitoring), and your fast/light HTTP server of choice for GET operations. Typically people have one fat SATA disk per mountpoint, each mounted at /var/mogdata/devNN. |
| domain | A domain is the top-level separation of files. File keys are unique within domains. A domain consists of a set of classes that define the files within the domain. Examples of domains: fotobilder, livejournal. |
| class | Every file is part of exactly one class. A class is part of exactly one domain. A class, in effect, specifies the minimum replica count of a file. Examples of classes: userpicture, userbackup, phonepost. Classes may have extra replication policies defined. |
| minimum replica count (mindevcount) | A property of a class. It defines how many times the files in that class need to be replicated onto different devices in order to ensure redundancy among the data and prevent loss. |
| key | A key is a unique textual string that identifies a file. Keys are unique within domains. Examples of keys: userpicture:34:39, phonepost:93:3834, userbackup:15. Fake structures work too: /pics/hello.png, or any string. |
| file | A file is a defined collection of bits uploaded to MogileFS to store. Files are replicated according to their minimum replica count. Each file has a key, is part of one class, and is located in one domain. Files are the things that MogileFS stores for you. |
| fid | A fid is an internal numerical representation of a file. Every file is assigned a unique fid. If a file is overwritten, it is given a new fid. |
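To make the relationships between domain, class, key, fid, and mindevcount more concrete, here is a minimal in-memory sketch in Python. It is only an illustration of the rules stated in the glossary, not MogileFS's actual schema or client API: the names `ToyNamespace`, `FileRecord`, `store`, `add_replica`, and `missing_replicas` are all hypothetical, and the fids are simply counted up from 1.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class FileRecord:
    fid: int                                    # internal numeric id of the stored file
    class_name: str                             # every file belongs to exactly one class
    devices: set = field(default_factory=set)   # ids of devices holding a replica


class ToyNamespace:
    """Hypothetical stand-in for the metadata kept in the MogileFS database."""

    def __init__(self):
        self._next_fid = count(1)   # assumption: fids are handed out sequentially
        self.classes = {}           # (domain, class name) -> mindevcount
        self.files = {}             # (domain, key) -> FileRecord

    def add_class(self, domain, class_name, mindevcount):
        # A class lives in exactly one domain and carries the replica policy.
        self.classes[(domain, class_name)] = mindevcount

    def store(self, domain, key, class_name, device):
        if (domain, class_name) not in self.classes:
            raise KeyError(f"class {class_name!r} is not defined in domain {domain!r}")
        # Keys are unique per domain; overwriting a key yields a new fid.
        record = FileRecord(next(self._next_fid), class_name, {device})
        self.files[(domain, key)] = record
        return record.fid

    def add_replica(self, domain, key, device):
        self.files[(domain, key)].devices.add(device)

    def missing_replicas(self, domain, key):
        # How many more devices are needed to satisfy the class's mindevcount.
        record = self.files[(domain, key)]
        mindevcount = self.classes[(domain, record.class_name)]
        return max(0, mindevcount - len(record.devices))


ns = ToyNamespace()
ns.add_class("livejournal", "userpicture", mindevcount=2)
ns.add_class("fotobilder", "userpicture", mindevcount=3)

# The same key can exist independently in two domains.
first_fid = ns.store("livejournal", "userpicture:34:39", "userpicture", device=1)
other_fid = ns.store("fotobilder", "userpicture:34:39", "userpicture", device=1)

print(ns.missing_replicas("livejournal", "userpicture:34:39"))  # one copy so far
ns.add_replica("livejournal", "userpicture:34:39", device=2)
print(ns.missing_replicas("livejournal", "userpicture:34:39"))  # policy satisfied

# Overwriting the key in the same domain assigns a fresh fid.
new_fid = ns.store("livejournal", "userpicture:34:39", "userpicture", device=3)
print(first_fid, other_fid, new_fid)
```

The sketch indexes files by the pair (domain, key) because the glossary says keys are unique only within a domain, and it keeps mindevcount on the class rather than on the file because the replica count is a property of the class.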

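Thanks to "li wenbo" for contributing this article. Our apologies: it was submitted a long time ago, and an oversight on our side meant it went unnoticed until now.

Next in the series: Installation and Configuration - A Brief Look at the Distributed File System MogileFS (2)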
