Tahoe-LAFS integration idea

anaqreon
Beginner
Posts: 36
Joined: Mon Mar 26, 2012 5:15 pm

Tahoe-LAFS integration idea

Postby anaqreon » Thu Sep 05, 2013 7:38 pm

What I think is a really promising idea is to combine Tahoe-LAFS with ownCloud. ownCloud already supports external storage systems (Dropbox, SFTP, Amazon S3, etc.). And since Tahoe is designed with a webAPI precisely to interface with external apps like ownCloud, the combo would be powerful.

If you had a bunch of people, each running a server with ownCloud and Tahoe, they would form a grid where everyone's data is stored RAID-style alongside everyone else's. Each person's independent ownCloud instance would create its own files, accessible only to that instance's users. This is possible because Tahoe creates a unique URI for each file/folder added to the grid, each completely independent of the others, and there is a read-write and a read-only URI for each of them. This means that if you upload a file but forget the URI, it's effectively lost and you can never retrieve that data. This is why Tahoe occasionally goes through and deletes forgotten stuff.

Anyway, the beauty of it is that when users on the grid want to share files/folders, all their ownCloud instances have to do is make the relevant URIs available to the ownCloud instances of those receiving the shared files. Since the recipient probably already has part of the data stored on their local machine (it's all one big shared, RAIDed pot), downloads would on average be very fast. Plus, the system works much like BitTorrent, in that each node contributes pieces of the file, which speeds up downloads. It doesn't matter if someone is not a member of the grid, either: you can still share files with them by sending them a standard link.
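To make the sharing step a bit more concrete, here is a rough sketch of what an ownCloud-side helper might do with a Tahoe URI. It assumes a Tahoe gateway on its usual local web port (http://127.0.0.1:3456) and the capability prefixes Tahoe-LAFS uses (URI:CHK: for immutable files, URI:SSK: / URI:SSK-RO: for mutable files, URI:DIR2: / URI:DIR2-RO: for directories). The function names and the example cap strings are made up for illustration; this is not an existing ownCloud or Tahoe API.

```python
from urllib.parse import quote

# Assumed address of a local Tahoe-LAFS web gateway.
DEFAULT_GATEWAY = "http://127.0.0.1:3456"

# Prefixes treated as read-write capabilities (assumption: mutable files
# and mutable directories; everything else is treated as read-only).
WRITE_PREFIXES = ("URI:SSK:", "URI:DIR2:", "URI:MDMF:")


def is_write_cap(cap: str) -> bool:
    """Return True if the capability string looks like a read-write cap."""
    return cap.startswith(WRITE_PREFIXES)


def share_link(cap: str, gateway: str = DEFAULT_GATEWAY) -> str:
    """Build a gateway URL for a capability, refusing to leak write-caps."""
    if is_write_cap(cap):
        raise ValueError("refusing to share a read-write capability")
    # Percent-encode the cap so its colons are safe inside the URL path.
    return f"{gateway.rstrip('/')}/uri/{quote(cap, safe='')}"


# Hypothetical placeholder cap, not a real capability.
print(share_link("URI:DIR2-RO:examplekey"))
```

The point of the check is that whoever holds the link holds the access, so only the read-only form should ever leave the instance.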

tilllt
Starter
Posts: 66
Joined: Fri Jul 06, 2012 9:59 am

Re: Tahoe-LAFS integration idea

Postby tilllt » Mon Sep 09, 2013 3:57 pm

Hi, I agree.

Tahoe-LAFS integration would be great.

It has been discussed before, both here and on the Tahoe mailing list, and basically what is missing is someone who could start the coding. I think there are some pitfalls though. The Tahoe developers have stated explicitly that for them "Security trumps Usability". From the little insight I have, ownCloud is not and can never be a super secure piece of software, because it relies on too many third-party libraries etc., which probably could never be properly security audited. So integrating a super secure file storage into a not-so-secure "cloud" software could already compromise the security.

I guess a rather simple task (for someone who knows coding) would be to integrate Tahoe as an external filesystem. A more complex thing would be to have a Tahoe Management "Plugin" which lets you quickly set up friend-nets etc.

Again, these guys take security very seriously, so even the Tahoe setup process can be quite a hassle, depending on your OS. Understandably they are not entirely comfortable using precompiled installation packages, as those can be manipulated by an interested party (e.g. the NSA) to introduce security flaws. So I think the ownCloud-Tahoe integration could be really cool and everything, but it would probably have to be done without the blessing of the Tahoe developers, as they would find too many possible security issues introduced by that combination. But maybe I am wrong about that; my insight on crypto stuff is very limited.

cheers,
t.

anaqreon
Beginner
Posts: 36
Joined: Mon Mar 26, 2012 5:15 pm

Re: Tahoe-LAFS integration idea

Postby anaqreon » Tue Sep 10, 2013 3:43 am

I appreciate your thoughts. Luckily my vision of the integration with Tahoe is not an all-or-nothing concept. If ownCloud has intrinsic security shortcomings, then that would help determine how much access to data on the grid the ownCloud instance would have. If I understand Tahoe enough, one great thing about it is the decoupling of the data access from the storage. For sensitive data, you could trust only particular apps running on particular hardware under your control. For less sensitive but still private data that you want to access, share and organize through the convenient interface of ownCloud, you can let a Tahoe client node controlled by your ownCloud instance access the grid, without worrying about it knowing anything about the other encrypted data stored there too.
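One way to picture the limited-access idea: the ownCloud-controlled client is handed the capability for a single directory only, and only the read-only form where it doesn't need to write. As far as I understand the Tahoe webAPI, asking the gateway for a node's metadata with ?t=json returns a two-element list (node type plus a dict that includes "ro_uri"). The sketch below assumes that shape; the sample response and its cap strings are placeholders I made up.

```python
import json


def read_only_cap(metadata_json: str) -> str:
    """Extract the read-only cap from a `?t=json` style metadata response.

    Assumes the response is a two-element list: a node type string and a
    dict of fields that includes "ro_uri".
    """
    node_type, fields = json.loads(metadata_json)
    return fields["ro_uri"]


# Hypothetical response body; the cap strings are placeholders, not real caps.
sample = json.dumps([
    "dirnode",
    {"rw_uri": "URI:DIR2:writekey", "ro_uri": "URI:DIR2-RO:readkey",
     "mutable": True, "children": {}},
])

print(read_only_cap(sample))
```

Anything outside that directory stays invisible to the ownCloud instance, because it never sees a URI for it.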

A possible use for the coupling of the two is a built-in backup of the ownCloud instance itself. Using an independent client, you could automate the backup of ownCloud to the same grid it is using for content storage. In the event of server failure or migration, you could simply reinstantiate the ownCloud instance from another server running a Tahoe client.
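A minimal sketch of the automation, assuming the tahoe command-line client is installed and an alias has already been created for the grid. It just wraps the tahoe backup command; the source path, alias name and destination directory are hypothetical.

```python
import subprocess
from typing import List


def backup_command(source_dir: str, alias: str,
                   dest: str = "owncloud-backup") -> List[str]:
    """Build the argv for `tahoe backup SOURCE ALIAS:DEST`."""
    return ["tahoe", "backup", source_dir, f"{alias}:{dest}"]


def run_backup(source_dir: str, alias: str) -> int:
    """Run the backup and return the exit code.

    Requires a configured Tahoe client node on this machine.
    """
    return subprocess.run(backup_command(source_dir, alias)).returncode


# Hypothetical data directory and alias, shown only to illustrate the call.
print(" ".join(backup_command("/var/www/owncloud/data", "grid")))
```

A real restore would also need a dump of the ownCloud database, not just the data directory, so the dump would have to be written somewhere under the backed-up path first.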

srfreeman
ownCloud professional
Posts: 1625
Joined: Sun Apr 21, 2013 10:34 am
ownCloud version: 6.0.4
Webserver: Apache
Database: MySQL
OS: Linux
PHP version: 5.3.10
Location: US

Re: Tahoe-LAFS integration idea

Postby srfreeman » Tue Sep 10, 2013 1:24 pm

Hello all,

Distributed file systems and grid computing have been around for quite some time. Production use of such things has largely been discounted for several, well documented reasons.

When dealing with the cloud computing / storage issue, the need to know where all enterprise data is stored becomes a major problem. Distributed file systems and the "security is in not knowing where it is" mantra do not properly address the issue.

The need for ever growing storage capacity for the enterprise is being answered with SAN and block storage technology, because the proper controls can be instituted there.

Automatic and nearly instantaneous failover of applications such as ownCloud can be easily attained through current cloud computing practices and infrastructure, without going outside the corporate firewall.

Integration of ownCloud into the corporate IT infrastructure, behind the firewall, with no connection to the public Internet, provides an excellent 'sync and share' solution for users - with no inherent security issues.

anaqreon
Beginner
Posts: 36
Joined: Mon Mar 26, 2012 5:15 pm

Re: Tahoe-LAFS integration idea

Postby anaqreon » Tue Sep 10, 2013 1:59 pm

I should have been clear about the target users I am considering; namely, individuals using ownCloud for personal data. I know very little about enterprise level infrastructure, so I appreciate your description of some of the components. One of the things I like most about ownCloud is that it is a powerful tool for regular people to use to take control over their data without sacrificing the benefits of the cloud. I am thinking of ways to enhance the capabilities of the personal cloud that involve the least amount of infrastructure and cost, in addition to technical savvy. Something like Tahoe and ownCloud could enable groups of friends and family to buy relatively simple pieces of hardware (think of the FreedomBox project) that would enable them to form data grids and webs of trust for sharing. I'm not claiming these are new ideas, but they are not yet realized on a mass scale.

One of the things the corporate-controlled cloud (Google, Dropbox, etc.) provides is constant access to everything, but individuals cannot attain such server uptimes or constant access in general. This is a major obstacle to putting the power of publishing and control of data in the hands of the individual. For example, you might want to publish a blog that people can reach all the time, and you don't want to worry about what happens when your ISP changes your IP address or your hardware is shut off for a while because a thunderstorm knocks out your power. You could imagine a system where your friend could reinstantiate your ownCloud instance on her network node if yours goes down and then return control to yours when it comes back online. This of course assumes some mechanism can reroute requests to your personal domain, but that is a separate issue that, while difficult, is by no means an insurmountable challenge.

srfreeman
ownCloud professional
Posts: 1625
Joined: Sun Apr 21, 2013 10:34 am
ownCloud version: 6.0.4
Webserver: Apache
Database: MySQL
OS: Linux
PHP version: 5.3.10
Location: US

Re: Tahoe-LAFS integration idea

Postby srfreeman » Tue Sep 10, 2013 2:27 pm

Hello anaqreon,

For consumer grade usage, there are also several reasons "they are not yet realized on a mass scale".

Not only do you underestimate the "technical savvy" and hardware requirements necessary to accomplish your noted task, personal experience has shown that the general admin of the 'friend net' must be a very good salesman. Without a constant sales pitch, users will tire of the performance hit on their personal (decidedly low end) system and drop out of the net, and the reduction of the size of the net soon kills its usefulness.

Constant problems caused by use of such a system on residential ISP accounts will also cause reduction of the size of the net, since ISPs do not often allow such usage on their residential networks.

anaqreon
Beginner
Posts: 36
Joined: Mon Mar 26, 2012 5:15 pm

Re: Tahoe-LAFS integration idea

Postby anaqreon » Tue Sep 10, 2013 2:43 pm

I hear what you're saying. I like to think I'm being optimistic rather than purposefully underestimating the difficulty. Believe me, the amount of hassle I have gone through just to share photos and videos with my family through "user friendly" methods like ownCloud, BitTorrent Sync, Friendica, etc., instead of the Facebook, Instagram or iCloud methods they like to use, has been educational for properly appreciating the scope of the human obstacles to these ideas.


Your point about ISP restrictions is an important one. There is a certain level of legislation that might be required to ensure that individual freedom online is possible, but bypassing fundamental issues like that, I am thinking that these decentralized f2f type networks will rely on a certain number of dedicated server hubs. You could imagine a group of friends chipping in to pay for a decent server on Rackspace or whatever, and relying on one of them to properly administer it. The hub would provide a connection for those that cannot run full nodes on their own for whatever reason. As usual, they would be sacrificing security and control for convenience, but to a much lesser degree than we currently do, particularly in the wake of all the scandal surrounding the NSA revelations. I meant to mention earlier that Friendica's Red matrix project might be a promising step towards this kind of new web and might even be integrable in some form with these other tools in creating it.

srfreeman
ownCloud professional
Posts: 1625
Joined: Sun Apr 21, 2013 10:34 am
ownCloud version: 6.0.4
Webserver: Apache
Database: MySQL
OS: Linux
PHP version: 5.3.10
Location: US

Re: Tahoe-LAFS integration idea

Postby srfreeman » Tue Sep 10, 2013 3:40 pm

Hello anaqreon,

I do not mean to cool anyone's optimism. I would, very much, like this to be a discussion that led to a new way of using the public Internet and the inherently 'now wasted' computing resources available therein. Unfortunately, continual testing over the past 30 years (yep, I am old) has shown too many stumbling blocks, in both human and technical terms, exist in the distributed file system / grid computing path.

The dedicated server idea is a general progression of your idea (been there too) that leads to the increased "technical savvy" (and my noted underestimate statement), required to continue from there.

Things like BitTorrent and the decided misuse of such technology have left too bad a taste in the metaphorical mouth of infrastructure providers to now ask for legislation to allow it on a 'personal freedom' basis.

Many folks seem concerned over the NSA debacle, yet remain oblivious to the 'Big Data' issues: 'every' file, contained in 'every' device, connected to 'any' enterprise (ISP on up) is subject to use by the enterprise 'Data Engineering' teams, and that can affect them.

Many employees have been shocked (and worse) to find out that their incomprehensible misunderstanding of data security on devices presented for the BYOD program can lead to the employer knowing more about them than they imagined. Not to mention the fact that exposure of personal financial information to your ISP's employees and enterprise IT staff is not really a good idea.

Personal responsibility and personal understanding of privacy must be enhanced to a far greater level, before any 'technical' methods will be useful in the consumer arena.

srfreeman
ownCloud professional
Posts: 1625
Joined: Sun Apr 21, 2013 10:34 am
ownCloud version: 6.0.4
Webserver: Apache
Database: MySQL
OS: Linux
PHP version: 5.3.10
Location: US

Re: Tahoe-LAFS integration idea

Postby srfreeman » Tue Sep 10, 2013 4:27 pm

Hello anaqreon,

I just cannot resist expounding on the 'dedicated server' idea from memories of testing the ideas presented in this thread.

The general progression of the 'dedicated server' idea in a world where the group of users wants to continue and grow, leads to the need of your own, provider grade, data center. Then to multiple, fiber connected data centers. Then to the users moving in and living in the data centers.

Certainly, living in a fortress designed to withstand any natural disaster and most man made ones, while guarded by armed individuals with authorization to use deadly force, does lead to a very secure life style. The need to provide your own infrastructure for such things as education, medical care, nutrition, etc... can be rather daunting though.

tilllt
Starter
Posts: 66
Joined: Fri Jul 06, 2012 9:59 am

Re: Tahoe-LAFS integration idea

Postby tilllt » Tue Sep 10, 2013 9:14 pm

Sorry guys,

I think the point got lost along the way of discussing a possible Tahoe-LAFS integration with ownCloud. IMHO, Tahoe absolutely makes sense in all of the mentioned environments. You can use Amazon EC2 storage and put your data on it, you can build a friendnet, or you can have your own storage grid in the server room of your office, or combine all of these. The real advantage of Tahoe is that you simply don't have to care about the security the storage providers can offer, because they simply store ciphertext. It doesn't matter if this is a friend of yours or Amazon. Once you have a working storage grid, no matter HOW you achieved this, security from the Tahoe (storage management) part can be assumed. For that reason, knowing exactly what data is stored where is not only unwanted but also unnecessary. In Tahoe you define how much redundancy your data should have, i.e. you can define a minimum of 5 out of 10 availability for the storage nodes and it will distribute the data accordingly. So even when 5 nodes are not reachable, you can retrieve your data. If you have all 10 nodes, it will just be faster.
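To put rough numbers on the 5-out-of-10 example, here is a small sketch. It assumes one share per storage node and that nodes are up or down independently of each other with the same probability, which is a simplification for a real friendnet. In Tahoe these parameters correspond, as far as I know, to the shares.needed and shares.total settings in tahoe.cfg; the function names below are just mine.

```python
from math import comb


def expansion_factor(needed: int, total: int) -> float:
    """Storage overhead: each file occupies total/needed times its size."""
    return total / needed


def availability(needed: int, total: int, p_up: float) -> float:
    """Probability that at least `needed` of `total` nodes are reachable,
    assuming one share per node and independent node uptime `p_up`."""
    return sum(
        comb(total, i) * p_up**i * (1 - p_up) ** (total - i)
        for i in range(needed, total + 1)
    )


# 5-of-10 encoding with nodes that are each online 90% of the time.
print(expansion_factor(5, 10))
print(availability(5, 10, 0.9))
```

So the price of surviving the loss of any 5 nodes is storing every file at twice its size across the grid, and the grid as a whole ends up far more available than any single home server in it.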

My main problem is exactly the Security vs. Usability issue. The URIs to files and directories are the keys to decrypt them. So optimally they should never be stored anywhere you cannot achieve 100% security. There is no "bookmarking" in the Tahoe WebUI, so you need to keep the keys somewhere secure, e.g. on a USB drive you carry around. Once you make them easily accessible (i.e. stored in ownCloud), ownCloud's security flaws can compromise Tahoe's security.

I personally would rather trust my own server, with my own ownCloud installation and Tahoe as a storage backend, than Dropbox with the NSA on their back and a completely opaque legal basis under which US secret services can force US companies to hand over private data, etc. So I would even risk storing decryption keys in ownCloud, but I would BET you the Tahoe guys would never approve. And they are right in a way: I am a noob when it comes to security, and a pro hacker could probably break into my Raspberry Pi in a couple of minutes (although I do my best so they can't).

So, to sum this up: in a world with perfect security, you cannot store any Tahoe URI in ownCloud, as ownCloud is potentially insecure. If you don't store the URIs / keys in ownCloud (or any other frontend), Tahoe's usability is limited. And I guess that's the premise under which Tahoe can and hopefully will be integrated into ownCloud some day, but people should be aware of it.

