The most remarkable legacy system I have seen

user5994461 | 526 points

I'll share an anecdote along the same lines, albeit much smaller in scale:

In 2004 an acquaintance asked me for help with sharing an Internet connection among the residents of his condominium after he had failed to get a common router/switch solution to work. The router was not playing ball unless all clients presented themselves on one and the same subnet, which meant the unmanaged switch would pass traffic directly between the ports, and that was a no-no for historic reasons relating to Windows and its malware of the time. So I repurposed an old anno 1999 ATX motherboard with a mix of Ethernet cards - the board offered 6 PCI slots and the condominium had 5 residents - 256 MB of RAM, and a low-power passively cooled Pentium III to act as router and switch in one, running OpenBSD with dhcpd(8) and pf(4) to manage clients and traffic.

16 years later this set-up is still in use 24/7, because ISPs claim logistical problems prevent them from installing gateway equipment in the building. It serves a VDSL2 line of 60/20 Mbit/s coming in over POTS. At its peak the computer had 6 years of uptime. It is still running OpenBSD 3.5 due to poor #OPSEC on my part.

daneel_w | 4 years ago

As the comment by timsworkaccount says, this is definitely Athena at JPMorgan.

For those interested, I gave a talk on Athena at PyData UK 2018 called "Python at Massive Scale". 4500 developers making 20k commits a week. Codebase with 35m LOC.

The video is here: https://www.youtube.com/watch?v=ZYD9yyMh9Hk

It covers Athena's origins, what it is used for, application architecture, infrastructure, dev tooling and culture.

stevesimmons | 4 years ago

I have seen stuff done two decades ago in the defense industry which put to shame some of the projects I am working on nowadays: extremely modular architecture, a very good service-oriented framework with hot-swappable and hot-reloadable components, automatic distribution, and automatic redundancy for fault tolerance.
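
Hot-reloading in particular is less magic than it sounds; the bare-bones version in Python is just importlib.reload (a generic toy sketch, obviously not that defense framework):

    # Bare-bones hot-reload demo: write a tiny component to disk, import it, change
    # the source, and importlib.reload() picks up the new code without a restart.
    # A real framework layers versioning, health checks and state migration on top.
    import importlib
    import pathlib
    import sys

    pathlib.Path("component.py").write_text("def process(x):\n    return x + 1\n")
    sys.path.insert(0, ".")
    import component

    print(component.process(1))     # 2

    pathlib.Path("component.py").write_text("def process(x):\n    return x * 10\n")
    importlib.reload(component)     # hot-swap: same module object, new behaviour

    print(component.process(1))     # 10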

There really are some gems out there.

brmgb | 4 years ago

I had the misfortune to work on (and re-write) a legacy system in a previous job and I constantly alternated between admiration and horror.

In some places the web GUI was designed exceptionally well: items placed together precisely, everything crafted expertly around the specific domain of concern.

On the other hand, there were race conditions, corrupted data, SQL injection, mountains of source files that re-implemented the same things, IE6/7-era compatibility issues, etc.

Then there were the mysterious parts of the program which I never understood, like the auto-email capabilities whose scripts could never be located, mysterious mirror servers whose existence I would only figure out by looking at the IP addresses of odd requests, and little bits of domain logic which I would accidentally break and have to cobble back together.

In a lot of ways, many of the problems just stemmed from age; it was an early-90's application in an early-2010's world. There definitely were some less-than-ideal software development practices that contributed to the difficulties as well. But I still had a sense that the previous developer had crafted this beautiful, unique, intricately complex and inter-dependent little world.

Sometimes I was sad I had to brush all of that aside even when my changes made things more robust, reliable and compatible with the modern world.

InfiniteRand | 4 years ago

We often call things legacy because they were made a while ago, or the person who made it isn't at the organisation anymore, or — frankly — because we weren't involved in making it. But those are just proxy indicators for the true indicator of legacy technology: how difficult it is to change.

timwis | 4 years ago

Mid-aughts I created a thing for healthcare data exchange. Inspired by Postfix. Today you'd probably compare it to AWS Lambda and Kinesis. But radically more useful and usable. At the time, I considered it the anti-J2EE.

We could onboard "interface developers" in a week. As in first time seeing Java and deploying code to production. Normal onboarding time was 3-6 months. We figured we had a 10-year backlog of work using the more traditional tools (e.g. BizTalk, SeeBeyond). Our firm awarded a $20k bonus for hiring referrals, demand was so great.

So my stack was a serious competitive advantage.

Alas, it was too simple. Our startup got acquired by Quest Diagnostics. They loved themselves some InterSystems Caché. Their PHBs (Directors and VPs opposed to the acquisition) just couldn't handle that my stuff ran circles around theirs. So of course it had to be killed in the crib.

FWIW, technically, an "interface" (what healthcare calls something that munges HL7 or equivalent) was just self-contained code. Straight-up data processing. Input, transform, output. Pass in a stream and some context, stuff comes out. Compose these snippets just like in Unix. In development, you'd just run it from the command line (or IDE). In production, we had a spiffy process runner with a spiffy management UI. Built-in logging, metrics, dashboards, etc.
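
If that sounds abstract, the shape was roughly this (a made-up minimal sketch of the idea in modern Python, not our actual framework or real HL7 handling):

    # Minimal sketch of the "interface" idea: self-contained input -> transform ->
    # output steps composed like a Unix pipeline. Names and data are illustrative.
    from typing import Callable, Iterable, Iterator

    Step = Callable[[Iterable[str], dict], Iterator[str]]

    def uppercase_segments(messages: Iterable[str], context: dict) -> Iterator[str]:
        for msg in messages:
            yield msg.upper()

    def tag_facility(messages: Iterable[str], context: dict) -> Iterator[str]:
        for msg in messages:
            yield context["facility"] + "|" + msg

    def compose(steps, messages: Iterable[str], context: dict) -> Iterator[str]:
        # Chain steps the way a shell chains processes with pipes.
        for step in steps:
            messages = step(messages, context)
        return messages

    out = compose([uppercase_segments, tag_facility],
                  ["msh|adt^a01|12345"], {"facility": "LAB01"})
    print(list(out))    # ['LAB01|MSH|ADT^A01|12345']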

I did some AWS Lambda stuff at my last gig. I absolutely fucking hated it. The managed scaling is maybe nice, but table stakes these days. The programming model is just a kick in the berries.

PS - A word about InterSystems. Of course I tried to be a good soldier. My architecture was more important than the implementation language or persistence engine; I really didn't care what tools we used. Oh boy. Caché might be the worst tool I've ever used. For example, one time a compiler error bricked my entire dev runtime, unrecoverable. The Caché partisan mocked me: "What did you do?!" Um, a typo. "Duh, don't do that!!" Apparently Caché self-immolation is normal. So keep regular image snapshots. At the time there were no version control options; you'd export "source" and hope it'd reimport later. Ludicrous.

specialist | 4 years ago

I've worked at a few banks and early fintechs, starting 30 years ago. I've seen some very remarkable systems well ahead of their time:

  - realtime distributed stream processing in the late 90s
  - complex event processing before it was called that
  - distributed application frameworks

fnord77 | 4 years ago

By the title, I was ready to read an article about software that was created in the 70's or 80's. Nope, 2008.

werdnapk | 4 years ago

At my work, we have (all in-house and most are more than twenty years old):

  - A (much more complicated) make clone
  - A test running framework
  - A remote session tool
  - A test specification framework
  - A preprocessing tool
  - At least six domain-specific languages for specifying compilers, assemblers, linkers, simulators.
Of course, you also need to know Linux, bash, C, C++, Perl, and Python.

Needless to say, it takes some time to get up to speed. On the other hand, you can run some very simple commands and have a bunch of servers run hundreds of thousands of tests on your code, on different OSes.

rustybolt | 4 years ago

Sounds like Athena (JPM) or Quartz (BAML). Though as I understand it, those have a lot more official buy-in than the article suggests.

jackric | 4 years ago

But... if I cannot describe it as a buzzword on my resume, clearly it is worthless.

bbarnett | 4 years ago

I'm sorry but I cannot, in my right mind, recommend anyone ever use Tornado, not least because Python now natively supports async/await. There was a thing... an unholy union of threads, sleeps, and tornado.gen co-routines. It would always break. How could it not, when it mixed every concurrency paradigm imaginable? When it did break, everything started burning. Data stopped flowing. Many an engineer tried and failed to fix it. Have you ever seen three TLS Client Hellos on the same TCP connection? I have. Tornado is actually the most accurate name you could give the framework: a big whirling mess. Maybe I'm being too hard on Tornado, but I kinda blame it for introducing co-routines before Python was ready for them. The company has since moved on to golang, but the tornado is still whirling and will be for years to come.
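
To make the "mixing paradigms" point concrete, here's roughly the before/after (a minimal sketch that assumes Tornado is installed; not code from that system):

    # Legacy style vs. native async/await; illustrative only.
    import asyncio

    from tornado import gen, ioloop

    # Old style: generator-based coroutine driven by Tornado's IOLoop.
    @gen.coroutine
    def old_style():
        yield gen.sleep(1)            # yields a Future back to the IOLoop
        raise gen.Return("done")      # the pre-Python-3.3 way to "return" a value

    print(ioloop.IOLoop.current().run_sync(old_style))

    # Native style: the same thing with async/await and asyncio.
    async def new_style():
        await asyncio.sleep(1)
        return "done"

    print(asyncio.run(new_style()))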

dcow | 4 years ago

Also at a bank: a file/database server that had two physical sides. Each side could grab a cartridge containing basically a CD-R, push it into a reader, and read the track of data based on a file-system-like record in an Oracle database. This was done to meet SEC Rule 17a-4 (Write Once Read Many) requirements.

It was old, like "Side A got stuck, have to run to the DC to fix it" old, and you would get errors accessing those files but not Side B files. I'm guessing 15+ years. Weirdly, you could grab a PDF off this thing in less than 2 seconds. This had to exist at other organizations, but this was the only one I saw.

Now Azure and AWS provide the same service for pennies on the dollar.
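
For example, on AWS the equivalent WORM guarantee is roughly S3 Object Lock in compliance mode (a hedged sketch with made-up bucket/key names, assuming boto3 is configured):

    # Minimal sketch of WORM-style retention with S3 Object Lock (compliance mode).
    # Assumes boto3 credentials are configured and the bucket was created with
    # Object Lock enabled; the bucket, key, and payload are made up.
    from datetime import datetime, timedelta, timezone

    import boto3

    s3 = boto3.client("s3")
    s3.put_object(
        Bucket="trade-records-worm",
        Key="statements/2021/acct-123.pdf",
        Body=b"%PDF-1.4 placeholder",
        ObjectLockMode="COMPLIANCE",    # retention cannot be shortened or removed
        ObjectLockRetainUntilDate=datetime.now(timezone.utc) + timedelta(days=7 * 365),
    )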

jabart | 4 years ago

I never personally developed on Athena, but I remember it requiring AIM (an internal distributed file system, sort of like NFS built on Hadoop, with a fairly complex data model). All the Python libraries were compiled and linked against the AIM fs and required some shenanigans to import modules, making the code not very portable.

nijave | 4 years ago

> Maybe one third of internal apps developed in the last years don’t work at all outside of Chrome, the page remains blank or has broken widgets all over the place.

Does this statistic match what others have seen? If so, that’s staggering, and unfortunate. I know there are more sites that do this than there used to be, but I wasn’t under the impression it was such a large proportion.

willj | 4 years ago

I don't think something created in 2008 can be considered a legacy system.

peterkelly | 4 years ago

Well, this may be one of the more interesting and useful threads I've seen here in a while: systems that are insanely robust, well engineered, and have withstood the test of time amid an ocean of hype. You don't see that very often. +1 for everyone's stories.

Uptrenda | 4 years ago

You've made an old guy feel even older LOL!

A couple of years ago I inherited a software engineering department near the bottom of a downward spiral, shipping 25+ years of VB6 code to customers. Brought in external help, and we spent a year and a half rebuilding a modern version. During that project, I saw the most awful code on a daily basis. I still have nightmares about it!

binarysneaker | 4 years ago

I have something similar, currently running 3k "lambdas" of rules that check whether a user action is fraud or not. Running 60k QPS with 32 nodes, Python 2 + NSQ.

A new "lambda" takes at about 1-60 seconds to propagate to all nodes.

est | 4 years ago

The article does not make it immediately obvious what language version the original made-in-2008 product was written in. Python 2.7 was not released until two years later, in summer 2010.

daneel_w | 4 years ago

If only it were so easy. In the real world, IE might still be required for some obscure critical applications that are difficult and/or expensive to replace.

doggydogs94 | 4 years ago

I'm surprised the SOX auditors didn't shut this down. It's the type of stuff the ITIL fanatics hate, and it makes IBs toxic places to work.

x87678r | 4 years ago

Fascinating story, but then it ends with a bust: 5 seconds to do anything? Seems like a show-stopper for real work...

cpr | 4 years ago

This is almost certainly JPMC. I wonder how many systems are also relying on COBOL/mainframes?

tomrod | 4 years ago

Nice

fabrijunca | 4 years ago

Reads cool until you reach the "5 seconds baseline" part.

qwerty456127 | 4 years ago