
Welcome to the s3fs project!

What is s3fs

s3fs is a FUSE filesystem written in Python and backed by Amazon's Simple Storage Service (S3). Amazon offers an open API for building applications on top of this service, which several companies have done using a variety of interfaces (web, rsync, FUSE, etc.). None of these companies offers a fully open source solution for using S3. This project is an attempt to remedy that.
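To give a feel for how the pieces fit together, here is a minimal, read-only sketch of a FUSE filesystem in Python whose file contents come from an S3 bucket. This is not the project's actual code; it assumes the fuse-python bindings and the boto library, and the bucket name is made up purely for illustration.

#!/usr/bin/env python
# Read-only illustration of a Python FUSE filesystem backed by S3.
# NOT the s3fs code itself; fuse-python and boto are assumptions here.
import stat, errno, fuse
from boto.s3.connection import S3Connection

fuse.fuse_python_api = (0, 2)

class S3Stat(fuse.Stat):
    def __init__(self):
        self.st_mode = 0
        self.st_ino = 0
        self.st_dev = 0
        self.st_nlink = 0
        self.st_uid = 0
        self.st_gid = 0
        self.st_size = 0
        self.st_atime = 0
        self.st_mtime = 0
        self.st_ctime = 0

class S3ReadOnlyFS(fuse.Fuse):
    def __init__(self, *args, **kw):
        fuse.Fuse.__init__(self, *args, **kw)
        # boto picks up AWS credentials from the environment.
        self.bucket = S3Connection().get_bucket('my-example-bucket')

    def getattr(self, path):
        st = S3Stat()
        if path == '/':
            st.st_mode = stat.S_IFDIR | 0o755
            st.st_nlink = 2
            return st
        key = self.bucket.get_key(path.lstrip('/'))
        if key is None:
            return -errno.ENOENT
        st.st_mode = stat.S_IFREG | 0o444
        st.st_nlink = 1
        st.st_size = key.size
        return st

    def readdir(self, path, offset):
        # Treats the bucket as one flat directory, which is enough for a sketch.
        yield fuse.Direntry('.')
        yield fuse.Direntry('..')
        for key in self.bucket.list():
            yield fuse.Direntry(str(key.name))

    def read(self, path, size, offset):
        key = self.bucket.get_key(path.lstrip('/'))
        if key is None:
            return -errno.ENOENT
        data = key.get_contents_as_string()
        return data[offset:offset + size]

if __name__ == '__main__':
    server = S3ReadOnlyFS()
    server.parse(errex=1)
    server.main()

The actual project layers write support, local-disk caching, and ownership checks on top of this kind of skeleton (see the goals and news below).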

Some notes about naming

There are lots of projects trying to make use of S3 at the moment. As such (and this is likely not surprising), there are several projects that make use of the name s3fs, or variants thereof. Most notably, there is the s3fs-fuse project (http://code.google.com/p/s3fs/wiki/FuseOverAmazon), and the s3fs project (http://code.google.com/p/s3fs-fuse/). There are several others, all in various stages of production. Naturally, given the accuracy of s3fs as a project name, it's hard to want to give it up. I apologize in advance for any confusion. In an effort to alleviate it, I'm mentioning here that the fuse-s3fs RPM undergoing review for inclusion in Fedora is the RPM generated from the project at this site.

Current Status

_Very_ early in development. No docs exist yet, but the utility is capable of creating an Amazon bucket, formatting it as an s3fs drive, and mounting it locally.
WARNING: You should not yet store any data on s3fs that you do not have backed up elsewhere! Development of this filesystem is early enough that data loss or corruption may occur!

Near Term Goals

  • Teach me more python :)
  • Ability to recover data from outside the FS (i.e. web based client interface) (DONE)
  • Implement setattr (scp to the fs won't work until this is done, I think)
  • Proper ownership checks for files/directories (SORT OF DONE)
  • Caching of files on local disk (DONE)
  • Scalability to large directory structures (currently limited, as the dir/file structure is stored in a class pickle; see the sketch below)
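To illustrate the limitation in the last bullet: when the whole directory tree lives in one pickled object, every metadata change means re-serializing and re-uploading the entire tree. A rough sketch of the pattern (the class and names here are hypothetical, not taken from the s3fs source):

import pickle

class FSNode(object):
    """One file or directory entry in the in-memory tree (illustrative only)."""
    def __init__(self, name, is_dir=False):
        self.name = name
        self.is_dir = is_dir
        self.children = {}   # name -> FSNode, only used for directories
        self.size = 0

# Build a small tree and serialize the whole thing in one shot.
root = FSNode('/', is_dir=True)
root.children['hello.txt'] = FSNode('hello.txt')

blob = pickle.dumps(root, pickle.HIGHEST_PROTOCOL)

# The pickled blob grows, and has to be rewritten in full, every time a file
# is added anywhere in the tree, which is what limits very large directories.
print(len(blob))

One obvious direction would be to split the metadata into smaller, per-directory objects so that a change only rewrites the affected piece.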

Long Term Goals

  • Ability to do multi-user mounts of the same bucket by implementing the MESI protocol via Amazon's Simple Queue Service (see the sketch below)
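As a rough illustration of the idea, and not a design that exists in s3fs yet: each mounting node could publish cache-state transitions for a file to a shared SQS queue and react to messages from its peers. The queue name and message format below are made up for the example, and the calls assume the boto library.

from boto.sqs.connection import SQSConnection
from boto.sqs.message import Message

# Credentials come from the environment (AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY).
conn = SQSConnection()
queue = conn.create_queue('s3fs-bucket-coherency')   # hypothetical queue name

def announce(state, path):
    """Tell the other nodes that we moved `path` to MESI state `state`."""
    m = Message()
    m.set_body('%s %s' % (state, path))
    queue.write(m)

def poll():
    """Process coherency traffic from peer nodes."""
    for m in queue.get_messages(num_messages=10):
        state, path = m.get_body().split(' ', 1)
        # A real implementation would downgrade or invalidate the local
        # cache entry for `path` here, depending on `state`.
        queue.delete_message(m)

# Example: we are about to write a file, so claim it Exclusive first.
announce('E', '/some/file')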

Some Notes on file locking

It's becoming rather evident that, given Amazon's data consistency model, multi-node mounting of S3 buckets will not be possible, at least not in any sort of efficient manner if locks are at all contested. That said, all is not lost. I think the most reasonable thing to do is to add an extra layer of software that provides locking in a reasonable manner to all nodes that share that mid-layer. In this case, that mid-layer is NFS. I'm currently working on local-mount file locking for s3fs (i.e. any processes that share a local mount will be able to lock files against one another). By implementing this, multiple remote nodes will then be able to leverage file locking through the mechanisms already in place in the NFS infrastructure. The model is as follows:

  • NFS server mounts an s3 bucket on /mnt/s3/bucket
  • NFS server exports /mnt/s3/bucket
  • NFS clients mount nfs.server.com:/mnt/s3/bucket

Note that FUSE doesn't support export_ops in the Linux kernel, so the default in-kernel NFS server won't work. You need to use a user-space NFS server (n-4, for example, should work just fine).
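From a client's point of view the locking involved is just ordinary POSIX locking, which is what the local-locking work aims to make behave sensibly on the s3fs side. A minimal sketch of a process taking an exclusive lock on a file in the mount, using only the Python standard library (the file path is hypothetical):

import fcntl

# On an NFS client, POSIX byte-range locks (fcntl) are forwarded to the
# server, so processes on different clients contend through the same
# mid-layer described above.
path = '/mnt/s3/bucket/shared.db'   # hypothetical file inside the mount

f = open(path, 'a+')
# Block until we hold an exclusive lock on the whole file.
fcntl.lockf(f, fcntl.LOCK_EX)
try:
    f.write('this write is serialized against other lock holders\n')
    f.flush()
finally:
    fcntl.lockf(f, fcntl.LOCK_UN)
    f.close()

The same code works unchanged whether the file lives on local disk, on the s3fs mount on the server, or on the NFS mount on a client; that is the point of using NFS as the mid-layer.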

This is a good solution; however, I'll eventually be working on a more direct mechanism that allows for multi-client locking in s3fs itself. Both approaches should have long-term viability.

News

March 7 2009

  • Started looking into integrating s3fs with jgarzik's coarse locking daemon for multi-user operation

May 16 2008

  • New version 0.5 released!

Apr 25 2008

  • Received some email expressing interest from Ubuntu. Looks like there may be .deb packages of s3fs available soon :)
  • Started using the bug tracker for bugs here. Check it out if you want to submit a non-distro-specific bug (https://fedorahosted.org/s3fs/newticket)

Apr 14 2008

  • I've begun thinking about how to do state sharing between s3fs nodes. Currently, user-space NFS servers are the only way to do multi-node S3 mounts. I think this might be an opportunity to take advantage of SCTP. To that end, I've submitted python-pysctp (the Python SCTP bindings) to Fedora for inclusion
  • Having learned about Python module packaging, I think it would be worth doing a bit of refactoring on s3fs itself as well

Apr 9 2008

  • I've begun work on local file locking, check the locking branch of git

Mar 24 2008

  • s3fs has been accepted into extras and will be pushed shortly for F-8 & rawhide
  • Once it's pushed and available for those two, I'll branch it for EL-5

Mar 20 2008

  • Still working through the fedora review
  • Working on getting extended attributes working

Mar 12 2008

  • Fedora review is ongoing. Package has been renamed in Fedora to fuse-s3fs
  • Lots of man page updates

Feb 29 2008

Feb 26 2008

  • Implemented chown/chmod/link/symlink

Feb 25 2008

  • Implemented writeback_time option
  • Implemented SHA1 fingerprinting
  • Implemented cache preservation
  • Implemented mkdir
  • Improved unmounting synchronization

Feb 22 2008

  • Writes from Cache to S3 are working!
  • Re-pickling of FS data working!
  • This means your data shows up if you do an unmount/mount!
  • Reads back from S3 are working!
  • New mount option format to take advantage of python fuse parser
  • Cache persistence seems to be working!
  • Need to implement sha1 fingerprinting to detect stale cache elements or some such

Feb 14 2008

  • File Reads seem to be working
  • Still lots of work to do for S3 background I/O
  • File permissions in cache need work

Feb 12 2008

  • File Writes to cache seem to be working!
  • Ironically, file reads, not so much :)

Feb 08 2008

  • Simple file creation seems to be working!
  • We now have man pages included in the project

Releases

Releases are found here: https://fedorahosted.org/releases/s/3/s3fs/

v0.5
Lots of bug fixes, and bucket locking to prevent accidental reformatting or deleting of buckets

v0.4
Base release. The filesystem can read/write and delete files. Be careful with your data, but you could actually use this thing right now. :)

Git

git clone git://git.fedorahosted.org/s3fs <local dir>

Browse the repository at http://git.fedorahosted.org/git/s3fs?p=s3fs.git;a=summary

Participation

No mailing list exists at the moment, but feel free to contact me directly at nhorman@… or open a ticket requesting an enhancement.