Join Collections in Firestore
How do I perform a SQL JOIN in Firestore? - it’s a difficult question almost all developers will come across. The simple answer for ALL NoSQL databases is that it’s just not possible in an apples-to-apples way. We can’t perform this operation server-side, however, we can get clever with custom RxJS operators to solve similar problems - plus gain the added benefit of maintaining realtime listeners on all data.
Our operators require AngularFire and will add some RxJS magic to its existing Observables to tackle the challenge of joining Firestore documents and collections together easily in Firestore. The code in this lesson is advanced - if you get stuck checkout these related lessons:
Document to Document JOIN
Joining documents together is slightly easier than collections. A common data model in NoSQL is to save a field with a document ID that points to a related document. We could just perform separate document reads one-by-one, but that would be cumbersome. Let’s build an operator that can handle this process seamlessly as a single Observable.
This lesson uses strings to keep track of relational data, but you might also use a Firestore `DocumentReference` when dealing with large complex paths.
Usage
Consider the following data model. We have three unique docs in separate collections. The user doc makes a reference to a specific pet and car ID - this can be a has one or belongs to relationship.
+users
docId=jeff {
car: 'subaru'
pet: 'humphrey'
}
+pets
docId=humphrey {
type: 'dog'
food: 'kibble'
}
+cars
docId=subaru {
model: 'Legacy'
doors: 4
}
The first argument is the AngularFirestore
instance. Operators are pure functions, so they we have to pass dependencies as arguments. The second arg is an object where each key is the field with the related doc ID and the value is the collection containing this doc. The result is a joined object that joins the related document data as a object on each key.
const user$ = afs.document('users/jeff').valueChanges()
const joined = user$.pipe(
docJoin(afs, { car: 'cars', pet: 'pets' } )
)
///// result /////
{
...userData
pet: { type: 'dog', food: 'kibble' },
car: { model: 'Legacy', doors: 4 }
}
docJoin Code
The code for this custom operator can be broken down into three major steps.
- Retrieve the data from the parent document, save it to an internal variable so it can be combined with the joins.
- switchMap to the doc reads of the relational data and combine them with combineLatest. This will wait for all reads to finish before any data is emitted.
- Map the joins to the parent document as a single object Observable.
import { AngularFirestore } from 'angularfire2/firestore';
import { combineLatest, defer } from 'rxjs';
import { map, switchMap } from 'rxjs/operators';
export const docJoin = (
afs: AngularFirestore,
paths: { [key: string]: string }
) => {
return source =>
defer(() => {
let parent;
const keys = Object.keys(paths);
return source.pipe(
switchMap(data => {
// Save the parent data state
parent = data;
// Map each path to an Observable
const docs$ = keys.map(k => {
const fullPath = `${paths[k]}/${parent[k]}`;
return afs.doc(fullPath).valueChanges();
});
// return combineLatest, it waits for all reads to finish
return combineLatest(docs$);
}),
map(arr => {
// We now have all the associated douments
// Reduce them to a single object based on the parent's keys
const joins = keys.reduce((acc, cur, idx) => {
return { ...acc, [cur]: arr[idx] };
}, {});
// Return the parent doc with the joined objects
return { ...parent, ...joins };
})
);
});
};
Collection Join (Left Join SQL)
One of most useful types of joins in a SQL database is a LEFT JOIN, which gives us all records from the first table, then any matching records from another table based on a shared key.
We can achieve a similar style query in Firestore when documents from two collections share a common key-value pair, similar conceptually to a primary and foreign key in SQL. In this example, it will replace the keys on the left collection with the query data from the right collection.
Usage
Let’s say we have a database structure that looks something like this below.
+users
docId=jeff {
userId: 'jeff'
...data
}
+orders
docId=a {
orderNo: 'A'
userId: 'jeff'
}
docId=b {
orderNo: 'B'
userId: 'jeff'
}
Our goal is to query the users collection, then query each orders collection for every doc, for instance: orders.where('userId', '==', userId)
.
The operator has three required arguments and an optional limit leftJoin(afs, joinKey, joinCollection, limit=100)
.
const users$ = afs.collection('users').valueChanges()
const usersOrders$ = users$.pipe(
leftJoin(afs, 'userId', 'orders')
)
///// result: users collection with orders array on each doc /////
[
{ ...user1Data, orders: [{ orderNo: 'A', userId: 'jeff' }, { orderNo: 'B', userId: 'jeff' }]},
{ ...user2Data, orders: [...]}
]
leftJoin Code
The code for a collection join is slightly more complex, but overall pattern is identical to the previous operator. Let’s break them down again:
- Call switchMap on the initial query.
- Loop over documents in that query, setting up a secondary query based on the common key-value pair between the documents. Perform all join queries together with combineLatest.
- Map them join queries to documents in the original query.
I've also added custom logging to this operator so we know the total documents in the joined query. In the console, it will log something like **Queried 25, Joined 100**, so you know that 125 reads were executed.
import { AngularFirestore } from 'angularfire2/firestore';
import { combineLatest, pipe, of, defer } from 'rxjs';
import { map, switchMap, tap } from 'rxjs/operators';
export const leftJoin = (
afs: AngularFirestore,
field,
collection,
limit = 100
) => {
return source =>
defer(() => {
// Operator state
let collectionData;
// Track total num of joined doc reads
let totalJoins = 0;
return source.pipe(
switchMap(data => {
// Clear mapping on each emitted val ;
// Save the parent data state
collectionData = data as any[];
const reads$ = [];
for (const doc of collectionData) {
// Push doc read to Array
if (doc[field]) {
// Perform query on join key, with optional limit
const q = ref => ref.where(field, '==', doc[field]).limit(limit);
reads$.push(afs.collection(collection, q).valueChanges());
} else {
reads$.push(of([]));
}
}
return combineLatest(reads$);
}),
map(joins => {
return collectionData.map((v, i) => {
totalJoins += joins[i].length;
return { ...v, [collection]: joins[i] || null };
});
}),
tap(final => {
console.log(
`Queried ${(final as any).length}, Joined ${totalJoins} docs`
);
totalJoins = 0;
})
);
});
};
The End
We built two unique RxJS operators that perform joins in Firestore and solve similar problems to those in SQL databases. Keep in mind, there is no one right way to do this - consider it a starting point to create abstractions around your own business logic.