
Anyone has stuff on tumblr ?

https://www.novabbs.com/rocksolid/article-flat.php?id=35&group=rocksolid.shared.security#35

Path: i2pn2.org!rocksolid2!def5!POSTED.localhost!not-for-mail
From: 3424234...@anon.com (3424234234)
Newsgroups: rocksolid.shared.security
Message-ID: <346382d951d142ad9579f7ab132c0d0d@def4>
Subject: Anyone has stuff on tumblr ?
Date: Mon, 18 Mar 2019 22:01:51+0000
Organization: def5
In-Reply-To:
References:
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit

lmao

https://news.ycombinator.com/item?id=19418165

zxcvbn4038 18 hours ago | on: Myspace lost all the music its users uploaded betw...

I used to work at Tumblr, the entirety of their user content is stored in a single multi-petabyte AWS S3 bucket, in a single AWS account, no backup, no MFA delete, no object versioning. It is all one fat finger away from oblivion.


leowoo91 8 hours ago

I guess your statement is a bit beyond NDA, but thank you for sharing.

dev_dull 5 hours ago

Borderline whistleblowing.

SmellyGeekBoy 5 hours ago

It's not covered by NDA if it's made up.

johnvanommen 5 hours ago

> I used to work at Tumblr, the entirety of their user content is stored in a single multi-petabyte AWS S3 bucket, in a single AWS account, no backup, no MFA delete, no object versioning. It is all one fat finger away from oblivion.

Remember when Microsoft lost all of the data for their Sidekick users? Basically they were upgrading their SAN and things went badly.

ummonk 18 hours ago

What the hell. It is so easy to configure multi-region Glacier backups, MFA delete, etc. for a single S3 bucket. Took me like a couple hours to set up versioning and backups, and a few days to set up MFA for admin actions. Why would they not set this stuff up?
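
A minimal sketch of the versioning and MFA delete setup mentioned above, assuming a boto3 S3 client is passed in. The function name, bucket name and MFA string are placeholders for illustration, not anything from Tumblr's setup. As far as the S3 API goes, the MFA delete setting is only accepted from the bucket owner's root credentials.

```python
def enable_versioning(s3_client, bucket, mfa=None):
    """Turn on object versioning, plus MFA delete when an MFA string is given.

    s3_client is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    mfa is "<device serial> <current code>" as expected by the S3 API.
    Returns the keyword arguments that were sent, for inspection.
    """
    config = {"Status": "Enabled"}
    kwargs = {"Bucket": bucket, "VersioningConfiguration": config}
    if mfa is not None:
        # MFA delete is part of the same versioning configuration.
        config["MFADelete"] = "Enabled"
        kwargs["MFA"] = mfa
    s3_client.put_bucket_versioning(**kwargs)
    return kwargs
```

With versioning on, a delete only adds a delete marker, so older object versions stay recoverable until they are removed explicitly.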

gregrata 17 hours ago

The key words you probably need to look at are "multi-petabyte". Not saying they shouldn't be doing something but it all costs - and at multi-petabytes, it cooooosts

1 Petabyte (and they have multiple) S3 - $30,000 a month, $360,000 a year

S3 - reduced redundancy - $24,000 a month, $288,000 a year

S3 - infrequent access - $13,100 a month, $157,000 a year

Glacier - $7,340 a month, $88,000 a year
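
The yearly figures above are just the monthly prices times twelve. A small sketch that redoes the arithmetic, taking the per-petabyte monthly prices exactly as quoted in the comment (they are not checked against any AWS price list here); the function and variable names are made up for the example.

```python
# Monthly price per petabyte in USD, as quoted in the comment above.
MONTHLY_USD_PER_PB = {
    "S3 standard": 30_000,
    "S3 reduced redundancy": 24_000,
    "S3 infrequent access": 13_100,
    "Glacier": 7_340,
}


def annual_cost(monthly_usd_per_pb, petabytes=1):
    """Yearly storage cost for the given number of petabytes."""
    return monthly_usd_per_pb * 12 * petabytes


for tier, monthly in MONTHLY_USD_PER_PB.items():
    print(f"{tier}: ${annual_cost(monthly):,} per PB per year")
```

For a multi-petabyte bucket, pass the size as the second argument, e.g. annual_cost(7_340, 2) for two petabytes on Glacier.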

zxcvbn4038 14 hours ago

Add in transit and CDN and Tumblr’s AWS bill was seven figures a month. A bunch of us wanted to build something like Facebook’s Haystack to do away with S3 altogether, but the idea kept getting killed because of concerns over all the places the S3 URLs were hard coded and also breaking 3rd party links to content in the bucket (for years you could link to the bucket directly - still can for content more than a couple years old)

PostOnce 16 hours ago

Well, the business was acquired for $500,000,000 and a single employee probably costs what backing up two petabytes of data for a year (on glacier) does.

They could also always use tapes, for something as critical as the data that is the lifeblood of your business.

Imagine if Facebook lost everyone's contact lists, how bad would that be for their business? Backups are cheap insurance.

FussyZeus 6 hours ago

Backups are still a hard sell for management, though. No matter how many companies die a quick and painful death when they lose too much business critical data, the bossmen just can't wrap their heads around spending $100k for what they perceive as no benefit.

Same problems with buying things like antivirus software or even IT management utilities; when they're doing their job, there's no perceivable difference. It's only when shit goes sideways that the value is demonstrated.

Hell you could take this a step further for IT as a whole; if IT is doing their job well, they're invisible. Then they can the entire department, outsource to offsite support, and the business starts hemorrhaging employees and revenue because nobody can get anything done.

magduf 3 hours ago

>No matter how many companies die a quick and painful death when they lose too much business critical data, the bossmen just can't wrap their heads around spending $100k for what they perceive as no benefit.

Yeah, but what exactly IS the benefit? The business doesn't die if something really bad happens? Is that really important though?

Consider the two alternatives:

1) The business spends $x00k/year on backups. IF something happens, they're saved, and business continues as normal. However, this money comes out of their bottom line, making them less profitable.

2) The business doesn't bother with backups, and has more profit. The management can get bigger bonuses. But IF something bad happens, the company goes under, but then what happens to the managers who made these decisions? They just go on to another job at another company, right?

I'm not sure I see the benefit of backups here.

FussyZeus 49 minutes ago

> Yeah, but what exactly IS the benefit? The business doesn't die if something really bad happens? Is that really important though?

I mean the way management gets on me when we have outages, you'd think that was a significant priority?

softawre 2 hours ago

The managers that make these decisions need to have equity.

ConceptJunkie 1 hour ago

I worked at a place that lost their entire CVS repository. The only reason they were able to restore it at all was because I made daily backups of the code myself. Sure, a lot of context data was still lost, but at least there was some history preserved.

antt 14 hours ago

They are expensive until the business goes bankrupt.

ummonk 16 hours ago

88k per year per petabyte is a small price to pay to protect your entire business from being wiped out.

OscarTheGrinch 12 hours ago

Devil's advocate: it depends on how many petabytes you have. This cloud of uncertainty over your uploads could be seen as the hidden cost of using a free platform.

dotancohen 8 hours ago

> cloud of uncertainty

So far as Myspace (or Tumblr apparently) is concerned, it is "somebody else's computer of uncertainty".

pmlnr 7 hours ago

There are Supermicro chassis out there with 106x14TB drives in 4U, super deep racks.

1PB is nothing today.

bufferoverflow 5 hours ago

Or they can just have their own backup solution for a lot cheaper. 8TB = $140 on Amazon.

1 petabyte = 125 drives = $17,500 (one-time cost).

It will probably cost more to connect all these drives to some sort of a server. Though 125 is within the realm of what a simple USB should be able to handle (127 devices per controller).
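
The drive count above can be checked with a few lines. This is only a sketch of the arithmetic in the comment: it counts raw capacity for a single copy (1 PB taken as 1000 TB), with no redundancy, enclosures, power or servers, and the helper names are made up for the example.

```python
import math


def drives_needed(capacity_tb, drive_tb):
    """Number of drives needed to hold capacity_tb, rounded up."""
    return math.ceil(capacity_tb / drive_tb)


def raw_drive_cost(capacity_tb, drive_tb, usd_per_drive):
    """One-time cost of the bare drives only."""
    return drives_needed(capacity_tb, drive_tb) * usd_per_drive


# Figures from the comment: 1 PB on 8 TB drives at $140 each.
print(drives_needed(1000, 8), raw_drive_cost(1000, 8, 140))
```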

lugg 13 hours ago

That's like a developer or two..

Wth?

quotemstr 2 hours ago

So, roughly the cost of one or two good engineers? Not having backups is penny wise and pound foolish.

ConceptJunkie 1 hour ago

"Penny wise and pound foolish" is the universal motto of management everywhere.

zwily 7 hours ago

MFA delete at least doesn’t cost any extra.

de_watcher 10 hours ago

They'll lose it as soon as they try to configure that.

labster 18 hours ago

I'm surprisingly okay with this. Well, I guess I'd miss McMansion Hell.

kevinmchugh 16 hours ago

McMansion Hell is now archived by the Library of Congress. Don't be too concerned.

http://mcmansionhell.com/post/181936133241/what-level-of-pos...

ivm 17 hours ago

Thousands of skilled artists use Tumblr as their main publishing platform.

lostlogin 16 hours ago

Picasso (supposedly) drew on a napkin, and Banksy draws on derelict walls or sticks his work through a shredder. The medium doesn’t need to be lasting. Edit: The potentially short-lived medium was chosen by the above artists. Tumblr users may not be too happy if work is lost.

buboard 7 hours ago

Banksy's walls are sold though; and he is still kind of the exception because of his art format. Not everything needs to be lasting but 100% temporary art is not common.

criddell 7 hours ago

How many do you think would be willing to pay some small monthly fee? I'm guessing most of them think their work is worth at least $5/month, right? Maybe Tumblr should become a paid service and ditch the advertising model. That way they could be more relaxed about what types of content they are willing to host.

kirillzubovsky 17 hours ago

I’ve heard from a friend at Amazon that AWS as a whole is like that, one click away from a total meltdown. Probably true.

stone-monkey 16 hours ago

That's basically what happened with S3 a couple years back. Mistyped command caused an outage for large parts of the internet in the US. Now, I dunno if they could make a big enough mistake that would bring down the whole company, but certainly it's been proven that a single mistake can affect major portions of the internet.

antt 14 hours ago

I always find it funny how I'm designing with best practices in mind on top of infrastructure someone out of university built as their first project.

nostrebored 13 hours ago

This is not the case with S3 and not the case with that incident.

lugg 13 hours ago

Pretty sure there are first year grads who have worked on S3 as their first project.

StavrosK 10 hours ago

So what? You're saying it as if they gave them root access to the servers and went "go nuts".

lugg 6 hours ago

Bugs in code happen. You don't need write access to cause irreparable damage when the app you're working on has it.

StavrosK 5 hours ago

This applies to everyone, juniors and seniors, and that's why we have code reviews, tests and tooling.

karlkatzke 7 hours ago

Yeah, that's pretty much what major companies I've worked for will do with summer interns.

StavrosK 7 hours ago

I don't know, that hasn't been my experience at all in the companies I've worked for (maybe because there's no way I'd let it happen).

antt 12 hours ago

You can't prove me wrong since its source is not available.

klodolph 8 hours ago

I would believe that AWS is one click away from being unavailable for 12 hours, but not one click away from major irrecoverable data loss.

(Don't ask for a rigorous definition of "one click away", though.)

dodobirdlord 15 hours ago

For most AWS services it would be fairly difficult to cause multi-region damage by mistake.

nostrebored 13 hours ago

> experienced code reviewers verifying change sets using sophisticated deployment infrastructure targeting physical hardware spread out across one or more data centers in each availability zone

but the availability numbers speak for themselves :/

tazjin 4 hours ago

My experience with Tumblr was generally that a large part of the content, especially larger media content like videos, failed to load most of the time. Makes me wonder if that's related ...

alekratz 7 hours ago

This is fascinating. Are there any other crazy "wtf, how has this site not died yet" stories from the inside?

scarface74 17 hours ago

I’m not saying it isn’t dumb, but that one fat finger would have to be

aws s3 rm s3://bucket --recursive

It won’t let you just go into the console or delete the stack that made it if the bucket isn’t empty.

Dunedan 52 minutes ago

That's not accurate.

From the S3 management console user guide[1]:

> You can delete an empty bucket, and when you're using the AWS Management Console, you can delete a bucket that contains objects. If you delete a bucket that contains objects, all the objects in the bucket are permanently deleted.

[1]: https://docs.aws.amazon.com/AmazonS3/latest/user-guide/delet...

VWWHFSfQ 17 hours ago

there was an S3 sync client that some people used that did:

aws s3 sync --delete ./ s3://your-bucket/

The delete flag was added by just a very innocuous checkbox in the UI. The result is that it removes anything not in the source directory. Kaboom. Everything's gone. The point is you have no idea what stuff is going to do even if you think it's obvious.
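
To make the effect of that flag concrete: with --delete, anything that exists in the destination but not in the source gets removed. A rough sketch of that selection as a plain set difference; the function name and the file names are hypothetical, and this is only a model of the behaviour, not how the CLI is implemented.

```python
def keys_removed_by_sync_delete(source_files, destination_keys):
    """Destination keys that a mirror-style sync with --delete would remove."""
    return sorted(set(destination_keys) - set(source_files))


# Syncing from an empty (or wrong) directory selects every key in the bucket.
print(keys_removed_by_sync_delete([], ["a.jpg", "b.jpg"]))
```

Running the real command with --dryrun first prints the operations it would perform without executing them.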

electroly 5 hours ago

Have you tried this? It takes forever to clean out a bucket. At the scale we're talking about, doing this on a single thread from the CLI tool means you could go home and come back the next day and cancel it then, and you still wouldn't have made a particularly big dent in the bucket. It's really a pain in the neck to delete a whole bucket full of data when you actually want to. It's "easy" to start off a recursive delete, sure, but I think you're overestimating the "kaboom" factor.
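
A back-of-the-envelope version of that point. The object count and delete rate below are purely hypothetical placeholders (nobody in the thread gives real numbers for Tumblr's bucket); the sketch only shows how quickly the time adds up.

```python
SECONDS_PER_DAY = 86_400


def days_to_delete(object_count, deletes_per_second):
    """Days needed to delete object_count objects at a steady rate."""
    return object_count / deletes_per_second / SECONDS_PER_DAY


# Hypothetical: 10 billion objects at 1,000 deletes per second.
print(f"{days_to_delete(10_000_000_000, 1_000):.1f} days")
```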

VWWHFSfQ 4 hours ago

not every business critical bucket has petabytes of data in it

electroly 3 hours ago

This one does. We're talking about Tumblr.

dahfizz 8 hours ago

Maybe the moral is that you shouldn't rely on third party clients for mission critical stuff if you don't know what they do.

foxtrottbravo 4 hours ago

I know that in this particular example that's good advice and more or less easily done.

Do you think it would be good to extend said argument to, say, scp / ftp clients?

VWWHFSfQ 8 hours ago

and also have backups like a normal competent person/organization does

bashinator 3 hours ago

awscli is a first-party client.

scarface74 1 hour ago

He mentioned a third party GUI wrapper on top of the CLI.

PetahNZ 11 hours ago

This would take so many hours to actually run though, probably weeks for that amount of data.

itronitron 11 hours ago

maybe someone at Tumblr can test this...

aiven 10 hours ago

After the porn ban they probably have only ~one petabyte.

Reedx 14 hours ago

Did anyone in the company make a big deal out of this?

cortesoft 16 hours ago

When was this? Being owned by Yahoo, I am surprised they don't use NetApp.

zxcvbn4038 15 hours ago

Tumblr rejected all things Yahoo, except the money, so the answer to just about anything Yahoo asked was either “no”, “get stuffed”, or silence and a note to David that he needed to escalate to Marissa.

On the other side the Yahoo services were so heavily integrated that it was hard to carve out any piece of them, and the few times we tried it was a slow and painful process because Yahoo’s piece was glitchy and unreliable outside of its home turf and the Tumblr engineers were defensive and argumentative about everything and not willing to help.

zimpenfish 11 hours ago

> Tumblr rejected all things Yahoo

Having worked at Yahoo, I understand this stance.

aasasd 11 hours ago

That's exactly how I imagined Tumblr's design and development, based on my multiple unsuccessful attempts, over the years, to find any useful navigation between blogs, or the function of reading comments.

johnvanommen 5 hours ago

> When was this? Being owned by Yahoo, I am surprised they don't use NetApp.

Dell used to offer an online backup service. It wasn't even running on Dell equipment!

Basically they acquired a company that offered the service, and while it would be "nice" if a Dell company ran on Dell gear, a lot of the time it's simply impractical/expensive to overhaul things.

soup10 16 hours ago

i do this too with my data on a smaller scale, but i'm surprised tumblr does this because even with only a few million files s3 buckets that big are awkward to work with

Posted on def4
