Arindam Paul

Music & Technology

How a Rare Book Became Apple's Most Beautiful App

The origin story of Apple Music Classical — and why classical music broke every search system we tried to build for it before Primephonic changed the model entirely.

5M+ Classical Recordings
100+ Classical Labels Licensed
2021 Apple Acquisition of Primephonic
2023 Apple Music Classical Launch

It was 2018, and we were three weeks into a contract with a startup called Primephonic — a classical music streaming service with a grand ambition and a search experience that, honestly, was not much better than guessing. I was at GlobalOrange in Amsterdam, researching how classical music had been catalogued historically, looking for anything that treated the repertoire with the structural rigor we needed. That is when I discovered it: a printed catalog called The Da Capo Catalog of Classical Music Compositions — dense, comprehensive, almost Victorian in its obsessive detail — listing the complete works of the classical masters. Composers, their opus numbers, their key signatures, the orchestras that had recorded their symphonies, the conductors who had shaped those recordings over a century.

I brought it to the team and said this is what we need to build from. GlobalOrange purchased the rights from the author, Jerzy Chwialkowski, to use the catalog's data as the foundation for our metadata system. I flipped through those 1,399 pages and felt something click into place. This was not a book. This was a data model waiting to be built.

That moment is where Apple Music Classical actually begins, though neither Apple nor anyone else knew it yet.

Why Classical Music Breaks Normal Search

When Spotify or Tidal catalog a pop song, the data model is simple: artist, album, track, release year. Maybe a genre tag. That is sufficient because a pop song has one artist, usually one version, and the song title is the primary identifier. Search for "Bohemian Rhapsody" and you get Queen. Done.

Now try to find a specific recording of Beethoven's Ninth Symphony. The composer is Beethoven — dead since 1827, not the performing artist. The performing artist is the Berlin Philharmonic, conducted by Herbert von Karajan, recorded in 1963, featuring soprano Gundula Janowitz. But there are also the 1962 recording, the 1977 recording, and the 1984 recording — all from the same conductor, same orchestra, meaningfully different performances. A classical listener might want just the fourth movement — the famous choral finale — and searching "Beethoven Ninth fourth movement Karajan Berlin" should work. On Spotify in 2018, it largely did not.

The Data Model Problem: Pop vs. Classical

Pop music (5 fields)
Artist: Queen
Album: A Night at the Opera
Track: Bohemian Rhapsody
Year: 1975
Genre: Rock
Artist = performer = primary search key. One canonical version per track. Title uniquely identifies the work.

Classical music (9+ fields)
Composer: Beethoven
Work: Symphony No. 5
Opus / Key: Op. 67 · C minor
Movement: I. Allegro con brio
Conductor: Herbert von Karajan
Orchestra: Berliner Philharmoniker
Rec. Year / Venue: 1962 · Jesus-Christus-Kirche
Label: Deutsche Grammophon
(more fields)

Why the Data Model Is the Product

A pop track needs five fields: artist, album, title, year, genre. A classical recording needs at minimum nine distinct identity layers — Composer, Work, Opus or Catalogue number, Movement or Act, Recording, Conductor, Orchestra, Soloists, and Year. Collapse any two of those layers into a flat text field and your search engine becomes useless to anyone who actually knows classical music.

This is why every streaming platform before Primephonic treated classical as a second-class genre. It was not indifference. It was architecture. The data model required to serve classical listeners correctly did not exist in any commercial streaming context — until we built it.

The problem is structural. Classical music has a hierarchy that pop metadata was never designed to represent: Composer, then Work, then Opus and Catalogue number, then Movement or Act, then specific Recording, then Conductor, then Orchestra, then Soloists, then Year. A search engine that treats all of those as flat text fields produces results that are, to a classical listener, worse than useless. They are insulting.
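The difference between typed layers and flat text can be sketched in a few lines. This is a minimal illustration in Python, not the Java and MongoDB system described in this article; the class names `Work` and `Recording` and the `flatten` helper are assumptions made for the example, and only a subset of the layers is modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Work:
    composer: str          # who wrote it, not who performs it
    title: str             # e.g. "Symphony No. 5"
    catalogue: str         # opus or catalogue number, e.g. "Op. 67"
    movements: tuple = ()  # movement or act titles, in order


@dataclass(frozen=True)
class Recording:
    work: Work
    conductor: str
    orchestra: str
    year: int
    soloists: tuple = ()   # empty for purely orchestral works


def flatten(rec: Recording) -> str:
    """What a pop-style schema keeps: one undifferentiated text blob."""
    parts = [rec.work.composer, rec.work.title, rec.work.catalogue,
             rec.conductor, rec.orchestra, str(rec.year), *rec.soloists]
    return " ".join(parts)


fifth = Work("Ludwig van Beethoven", "Symphony No. 5", "Op. 67",
             ("I. Allegro con brio",))
rec = Recording(fifth, "Herbert von Karajan", "Berliner Philharmoniker", 1962)

# Typed model: the role of each name is explicit.
composer, conductor = rec.work.composer, rec.conductor

# Flat model: both names are present, but nothing says which is which.
blob = flatten(rec)
```

In the typed model a query can ask for recordings where the conductor is Karajan. In the flat blob, "Karajan" and "Beethoven" are just two words in the same string.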

Primephonic had licensing agreements with hundreds of classical labels. They had the audio. What they did not have was a data architecture that could make that audio findable in the way classical listeners actually think about music.


What Happens When You Search

The clearest way to understand why Primephonic was different is to watch what happens when the same query hits two different systems. Type "Beethoven Symphony 5 Karajan" into a platform built on pop infrastructure and watch the results come back: a flat text match across all metadata fields simultaneously, returning everything that contains any of those words, in any field, in any order. You get recording compilations with "Karajan" in the liner notes, tribute albums titled "Beethoven's Greatest," and a Spotify playlist called "5 Hours of Classical for Study." What you do not reliably get is the specific Karajan recording you were looking for.

What Happens When You Search "Beethoven Symphony 5 Karajan"

Search query: "Beethoven Symphony 5 Karajan"

Spotify / pop platform
Step 1. Tokenize query into words: Beethoven, Symphony, 5, Karajan
Step 2. Match all fields simultaneously (flat)
✕ "5 Hours of Classical for Study" (playlist)
✕ Beethoven's Greatest (compilation, no Karajan)
✕ Karajan Conducts (liner note text match)
✕ Symphony No. 5 (wrong conductor)
Result: Irrelevant matches. User abandons search. No understanding of composer vs. conductor vs. work hierarchy.

Primephonic
Step 1. Decompose query by semantic role: composer: Beethoven, work: Sym. No. 5, conductor: Karajan
Step 2. Match each token against its typed field
✓ Sym. No. 5 · Karajan · Berlin Phil · 1962
✓ Sym. No. 5 · Karajan · Berlin Phil · 1963
✓ Sym. No. 5 · Karajan · Berlin Phil · 1977
✓ Sym. No. 5 · Karajan · Vienna Phil · 1984
Result: Exact recordings returned in chronological order. Composer / Work / Conductor understood as distinct identity layers.

Primephonic understood the query differently. It decomposed "Beethoven Symphony 5 Karajan" into three typed entities: composer equals Beethoven, work equals Symphony No. 5, conductor equals Karajan. The search then matched each entity against its own field in the hierarchy, returning all four of the main Karajan recordings of that symphony in chronological order, with recording dates, orchestras, and labels clearly surfaced. The user found exactly what they were looking for in the first three results.
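The decomposition step can be sketched as follows. This is an illustrative Python toy, not the production search engine: the lookup tables, the `decompose` and `search` functions, and the regular expression are all assumptions, and the sample records reuse the recordings listed in the comparison above plus two records added only to show what gets filtered out.

```python
import re

# Hypothetical lookup tables; a real system would draw these from the catalogue.
COMPOSERS = {"beethoven": "Beethoven"}
CONDUCTORS = {"karajan": "Karajan"}

# Sample records, deliberately out of order.
RECORDINGS = [
    {"composer": "Beethoven", "work": "Symphony No. 5", "conductor": "Karajan",
     "orchestra": "Berlin Phil", "year": 1977},
    {"composer": "Beethoven", "work": "Symphony No. 5", "conductor": "Karajan",
     "orchestra": "Vienna Phil", "year": 1984},
    {"composer": "Beethoven", "work": "Symphony No. 5", "conductor": "Karajan",
     "orchestra": "Berlin Phil", "year": 1962},
    {"composer": "Beethoven", "work": "Symphony No. 5", "conductor": "Karajan",
     "orchestra": "Berlin Phil", "year": 1963},
    # Placeholder records that a typed match should exclude.
    {"composer": "Beethoven", "work": "Symphony No. 9", "conductor": "Karajan",
     "orchestra": "Berlin Phil", "year": 1962},
    {"composer": "Beethoven", "work": "Symphony No. 5",
     "conductor": "Another Conductor", "orchestra": "Some Orchestra", "year": 1970},
]


def decompose(query: str) -> dict:
    """Assign each query token a semantic role instead of treating it as text."""
    roles, remaining = {}, []
    for token in query.lower().split():
        if token in COMPOSERS:
            roles["composer"] = COMPOSERS[token]
        elif token in CONDUCTORS:
            roles["conductor"] = CONDUCTORS[token]
        else:
            remaining.append(token)
    # Leftover tokens are read as a work reference: "symphony 5" -> "Symphony No. 5".
    m = re.fullmatch(r"(symphony|sonata|concerto)\s+(?:no\.?\s*)?(\d+)",
                     " ".join(remaining))
    if m:
        roles["work"] = f"{m.group(1).capitalize()} No. {m.group(2)}"
    return roles


def search(query: str) -> list:
    """Match every typed entity against its own field, oldest recording first."""
    roles = decompose(query)
    hits = [r for r in RECORDINGS if all(r[k] == v for k, v in roles.items())]
    return sorted(hits, key=lambda r: r["year"])


results = search("Beethoven Symphony 5 Karajan")
```

Because "Karajan" is matched only against the conductor field and "Symphony No. 5" only against the work field, the Ninth Symphony and the other conductor's Fifth drop out without any relevance scoring.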

Building the Database

The rare Da Capo catalog book gave us our schema. We spent weeks reverse-engineering it into a proper relational and document model — composers and their complete works, canonical opus numbers cross-referenced with alternate catalogue systems like the Köchel catalogue for Mozart or the BWV numbering for Bach. We built a hierarchy: Work at the top, then recordings of that work, each recording linked to its conductor, its ensemble, its soloists, its recording date, its label. The full story of how we parsed those 1,399 pages into a structured database is covered in the Da Capo parsing article.
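To make the catalogue cross-referencing concrete, here is a small Python sketch of recognising which numbering system a work title uses. The patterns, the `CatalogueNumber` type and the `parse_catalogue` function are assumptions for illustration; they cover only opus, Köchel and BWV numbers and ignore suffixes and sub-numbers that real catalogues contain.

```python
import re
from typing import NamedTuple, Optional


class CatalogueNumber(NamedTuple):
    system: str   # "Op", "K" (Köchel) or "BWV"
    number: int


# Checked in order; each pattern captures the numeric part only.
_PATTERNS = [
    ("Op", re.compile(r"\bop(?:us)?\.?\s*(\d+)", re.IGNORECASE)),
    ("K", re.compile(r"\bk(?:v)?\.?\s*(\d+)", re.IGNORECASE)),
    ("BWV", re.compile(r"\bbwv\.?\s*(\d+)", re.IGNORECASE)),
]


def parse_catalogue(text: str) -> Optional[CatalogueNumber]:
    """Return the first catalogue reference found in a title, or None."""
    for system, pattern in _PATTERNS:
        match = pattern.search(text)
        if match:
            return CatalogueNumber(system, int(match.group(1)))
    return None


beethoven = parse_catalogue("Symphony No. 5 in C minor, Op. 67")
mozart = parse_catalogue("Symphony No. 40 in G minor, K. 550")
bach = parse_catalogue("Cello Suite No. 1, BWV 1007")
```

Storing the system and the number as separate values is what lets a search treat "K. 550", "KV 550" and "K550" as the same key.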

This was not a weekend project. Classical music has centuries of accumulated recordings, multiple competing cataloguing traditions, inconsistent naming conventions across labels, and the added complexity that many works exist in versions — a composer's revised edition, a different orchestration, a piano reduction. We had to make decisions about canonical identity: is a string quartet arrangement of a piano sonata the same work? It is not, and the database had to know that.

You cannot build a good classical music search engine without genuinely learning the domain. The data has to reflect those distinctions because the users will absolutely notice if it does not.

The search logic we built on top of this was designed to understand the hierarchy. If you searched for "Karajan Beethoven," you wanted Karajan's recordings of Beethoven's works — not a flat text match across all metadata fields. If you searched for "Symphony No. 5," you wanted to be asked which composer, because a hundred composers wrote a piece called Symphony No. 5. The system had to handle natural language queries, abbreviations, common misspellings of German and Italian composer names, and the fact that some conductors and orchestras are famous enough that users search only by those.
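Two of those behaviours, tolerating misspelled composer names and asking for a composer when a title is ambiguous, can be sketched with Python's standard library. This is an assumption-laden toy: the tiny index, the `correct_composer` and `resolve` functions, and the 0.7 similarity cutoff are illustrative choices, not the tuned index described here.

```python
import difflib

# Tiny illustrative index: one title, several composers who wrote such a work.
WORKS_BY_TITLE = {
    "Symphony No. 5": ["Beethoven", "Mahler", "Shostakovich", "Tchaikovsky"],
}
KNOWN_COMPOSERS = ["Beethoven", "Mahler", "Shostakovich", "Tchaikovsky"]


def correct_composer(name: str, cutoff: float = 0.7):
    """Map a possibly misspelled name to a known composer, or None."""
    lowered = {c.lower(): c for c in KNOWN_COMPOSERS}
    matches = difflib.get_close_matches(name.lower(), list(lowered),
                                        n=1, cutoff=cutoff)
    return lowered[matches[0]] if matches else None


def resolve(title: str, composer: str = None) -> dict:
    """Resolve a work, or report that the user must choose a composer."""
    candidates = WORKS_BY_TITLE.get(title, [])
    if composer is not None:
        fixed = correct_composer(composer)
        candidates = [c for c in candidates if c == fixed]
    if len(candidates) == 1:
        return {"status": "resolved", "composer": candidates[0], "title": title}
    if candidates:
        return {"status": "ask_composer", "options": sorted(candidates)}
    return {"status": "not_found"}


ambiguous = resolve("Symphony No. 5")
fixed = resolve("Symphony No. 5", "Bethoven")
```

A bare "Symphony No. 5" comes back with a list of composers to choose from, while the misspelled "Bethoven" is corrected and resolves to a single work.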

On the engineering side, I built the database on MongoDB, which gave us the document flexibility to model the hierarchical classical data without forcing it into a rigid relational schema. The API layer was Java and Spring Boot, with a search index tuned specifically for the classical naming patterns we had catalogued. Authentication and payment infrastructure ran on top: single sign-on, Apple Pay, Google Pay, and Adyen for payment processing — because Primephonic was a paid subscription service and the checkout flow had to be seamless enough that a person who just discovered a Schubert recording they desperately wanted to hear would not abandon their cart.

5M+
Classical recordings now served on Apple Music Classical — built on the architecture Primephonic established in Amsterdam in 2018

The Journey — Amsterdam to Apple

2018: Primephonic contract begins. GlobalOrange, Amsterdam. Primephonic engages the team to rebuild their search and metadata systems from scratch. The brief: make classical music findable in the way listeners actually think about it.

2018: The Da Capo catalog discovered. 1,399 pages. Complete works of the classical masters: composers, opus numbers, key signatures, conductors, orchestras. GlobalOrange purchases rights from author Jerzy Chwialkowski. This book becomes the data schema.

2018–19: Metadata database and search engine built. Hierarchical document model in MongoDB. Composer → Work → Opus/Catalogue → Movement → Recording → Conductor → Orchestra → Year. Search index tuned for classical naming patterns, abbreviations, and multi-language misspellings. Stack: MongoDB · Spring Boot · Java · Adyen · AWS · CircleCI.

2019: Primephonic launches with purpose-built search. First commercial classical music streaming platform with a dedicated hierarchical metadata model. Classical listeners who had given up on streaming are finding recordings. Critics and music librarians take notice.

Aug 2021: Apple acquires Primephonic. Terms undisclosed. Apple's stated intent: relaunch as a dedicated classical music app within Apple Music. The metadata architecture travels with the acquisition.

Mar 2023: Apple Music Classical launches globally. 5M+ recordings. Available on every iPhone worldwide. The search model that began in an Amsterdam office is now running at Apple scale.

What We Built Was the First of Its Kind

I want to be precise about this because precision matters: when we launched the Primephonic search and database in production, it was the first purpose-built classical music metadata database and search engine in a commercial streaming context. Every other classical music offering in the streaming world was built on pop metadata infrastructure with classical music awkwardly shoehorned in. Primephonic was built for classical first, from the ground up.

The response from classical music listeners was immediate and genuinely moving. People who had given up on streaming because the existing platforms could not find the recordings they wanted were suddenly finding them. Music librarians wrote about it. Critics wrote about it. The classical community is small and passionate and extremely vocal when something gets it right, and Primephonic was getting it right in a way that had not existed before.

Domain depth is not a soft skill. It is a structural competitive advantage — and the only approach that produces software worth acquiring.

August 2021

I remember exactly where I was when the news broke. Apple announced the acquisition of Primephonic in August 2021. The terms were not disclosed. What was disclosed was Apple's stated intent: to relaunch Primephonic as a dedicated classical music app within Apple Music.

There is a specific feeling that comes when a company the size of Apple validates something you built. It is not quite pride, because pride would be too simple. It is more like confirmation — the verification that the problem was real, that the solution was sound, that the years of domain work and technical precision had produced something genuinely worth acquiring. I had moved on from the project by then, but the work had not moved on from me. You do not spend years cataloguing the complete works of Beethoven and Bach and Brahms and walk away unchanged.

Apple relaunched the service as Apple Music Classical in March 2023. The app currently serves more than five million classical recordings. The metadata architecture and the search model that began with a rare catalog book in an Amsterdam office are now running at Apple scale, for millions of users, on every iPhone in the world.

What This Taught Me About Domain Depth

The lesson I carry from Primephonic is not about Java or MongoDB or Spring Boot. Those were means. The lesson is about domain depth as a competitive engineering advantage — something I have found to be true across every industry I have worked in.

Most software engineers, including very good ones, approach a new domain as a requirements specification to be fulfilled. The requirements say "search by composer" so you build a composer field. But if you actually understand classical music — if you understand that a composer is not equivalent to a pop artist, that an opus number is a primary key with exceptions, that conductor identity changes the meaning of a recording — then you build something that goes far beyond the specification. You build something that earns the trust of users who know the domain intimately and will immediately see through anything superficial.

I got that depth partly from the rare book, partly from the people at Primephonic who had dedicated their careers to classical music, and partly from a genuine willingness to sit with a problem long enough that it stopped feeling foreign. That is slower than the usual approach. It is also, in the long run, the only approach that produces software worth acquiring.


Apple Music Classical exists because a small team in Amsterdam took classical music seriously enough to build something that had never existed before. Somewhere right now, a person is finding a recording of a Schubert string quartet they have spent twenty years looking for, because we built a database that could understand what they were actually asking for. That is what software is supposed to do.

Arindam Paul — Software Engineer, Amsterdam