Some very rough notes from MongoDB London 2013:
Session 1 - Performance
Keep indexes in memory
Data in memory if you can
Slow queries can be configured to appear in logs
Use SSDs
Growing documents is bad
Do an 'explain' on queries
Padding factor
DB locks when writing
Sharding to scale writes
Optionally read from slaves but they may not have the written data yet
Write concern level configurable
Can set the importance level of a write based on which nodes have acknowledged it
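A rough sketch of the two write concern notes above: it builds the options document sent with the getLastError command. The helper name write_concern is made up for this example. w, j and wtimeout are the standard option names. "multi_dc" is a hypothetical tag-based mode that would first have to be defined in the replica set's getLastErrorModes setting.

```python
def write_concern(w=1, journal=False, timeout_ms=None):
    """Build a getLastError-style options document (hypothetical helper)."""
    opts = {"getLastError": 1, "w": w}
    if journal:
        # Wait until the write has reached the on-disk journal
        opts["j"] = True
    if timeout_ms is not None:
        # Give up waiting for acknowledgement after this many milliseconds
        opts["wtimeout"] = timeout_ms
    return opts


# Acknowledged by the primary only: fastest, least safe
fast = write_concern(w=1)
# Acknowledged by a majority of nodes and journalled: slower, safer
safe = write_concern(w="majority", journal=True, timeout_ms=5000)
# Acknowledged by nodes matching a tag-based mode (assumed name "multi_dc")
tagged = write_concern(w="multi_dc", timeout_ms=5000)
```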
You can define your own _id structure to help querying
Use short field names - use an abstraction layer
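One possible shape for such an abstraction layer, as a minimal sketch: application code keeps readable names, and a mapping shortens them just before storage. The field names and the helpers to_stored and from_stored are invented for illustration.

```python
# Readable name -> short stored name (all names here are made-up examples)
FIELD_MAP = {"hostname": "h", "timestamp": "t", "cpu_load": "c"}
REVERSE_MAP = {short: long for long, short in FIELD_MAP.items()}


def to_stored(doc):
    """Translate readable keys into their short stored form."""
    return {FIELD_MAP.get(key, key): value for key, value in doc.items()}


def from_stored(doc):
    """Translate short stored keys back into readable names."""
    return {REVERSE_MAP.get(key, key): value for key, value in doc.items()}


reading = {"hostname": "web1", "timestamp": 1365500000, "cpu_load": 0.42}
stored = to_stored(reading)
```

Field names are stored in every document, so shorter keys save space on disk and in memory, while the mapping keeps the application code readable.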
Covered indexes
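A query is covered when the index alone can answer it: every field in the filter and every field returned must be in the same index. The check below is a simplified sketch of that rule, not how the server decides. The function name is_covered and the example fields are assumptions.

```python
def is_covered(index_fields, query_fields, projection):
    """Return True if the index alone could answer the query.

    projection maps field name -> 1 (include) or 0 (exclude), as in find().
    """
    # With no included fields the whole document comes back,
    # so the server has to read the document itself.
    if not any(projection.values()):
        return False
    returned = {field for field, keep in projection.items() if keep}
    # _id is returned by default unless it is explicitly excluded
    if projection.get("_id", 1):
        returned.add("_id")
    needed = set(query_fields) | returned
    return needed <= set(index_fields)


index = ["host", "day"]
covered = is_covered(index, ["host"], {"day": 1, "_id": 0})
not_covered = is_covered(index, ["host"], {"day": 1})  # _id still returned
```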
Dropping collections is faster than removing
mongostat
Run your own benchmark - benchRun
serverdensity.com/mdb
@davidmytton
One document per day, pre-allocated, then updated with the $inc operator
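A minimal sketch of the pre-allocation idea: create the day's document up front with every counter set to zero, so later $inc updates change values in place and the document never grows. The "host:day" _id layout, the hours field and the helper names are assumptions for this example. apply_inc is only an in-memory stand-in for what the server does.

```python
def new_day_document(host, day):
    """Pre-allocate one document per host per day, all 24 counters at zero."""
    return {"_id": f"{host}:{day}", "hours": {str(h): 0 for h in range(24)}}


def inc_update(hour, amount=1):
    """Build the update document that bumps one hourly counter."""
    return {"$inc": {f"hours.{hour}": amount}}


def apply_inc(doc, update):
    """In-memory stand-in for the server applying $inc on dotted paths."""
    for path, amount in update["$inc"].items():
        *parents, leaf = path.split(".")
        target = doc
        for key in parents:
            target = target[key]
        target[leaf] = target.get(leaf, 0) + amount
    return doc


doc = new_day_document("web1", "2013-04-09")
apply_inc(doc, inc_update(14))
apply_inc(doc, inc_update(14, 2))
```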
Session 2 - Backups
<p lang="en-US">
bsondump converts bson to json
</p>
<p lang="en-US">
Use journalling
</p>
<p lang="en-US">
Disk backups faster
</p>
<p lang="en-US">
TTL indexes and capped collections
</p>
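<p lang="en-US">
A pure-Python sketch of what the two features do, with no server involved: a TTL index removes documents whose date field is older than a configured number of seconds, and a capped collection discards its oldest entries once it is full. The function name surviving and the sample data are made up. Real capped collections are sized in bytes, so the fixed-length deque is only an analogy.
</p>

```python
from collections import deque
from datetime import datetime, timedelta


def surviving(docs, field, expire_after_seconds, now):
    """Return the documents a TTL index on `field` would keep at time `now`."""
    cutoff = now - timedelta(seconds=expire_after_seconds)
    return [
        d for d in docs
        if not (isinstance(d.get(field), datetime) and d[field] < cutoff)
    ]


now = datetime(2013, 4, 9, 12, 0, 0)
events = [
    {"msg": "old", "created": datetime(2013, 4, 9, 10, 0, 0)},
    {"msg": "recent", "created": datetime(2013, 4, 9, 11, 59, 30)},
    {"msg": "no date"},  # documents without the date field never expire
]
kept = surviving(events, "created", 3600, now)

# Capped-collection analogy: only the newest 3 entries are retained
capped = deque(maxlen=3)
for i in range(5):
    capped.append(i)
```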
<p lang="en-US">
<span style="font-weight: bold;">Replication</span>
</p>
<p lang="en-US">
An uneven number of nodes is advised
</p>
<p lang="en-US">
There's a mesh of heartbeats between the nodes
</p>
<p lang="en-US">
An arbiter node only exists for voting - it stores no data
</p>
<p lang="en-US">
You can have hidden nodes, for "backup" purposes only
</p>
<p lang="en-US">
You can give a node a slaveDelay so the replication is delayed
</p>
<p>
Servers can be tagged, e.g. { datacenter: "new york" }
</p>
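<p>
The member options mentioned above (arbiter, hidden, slaveDelay, tags) can be pictured together in one replica set configuration document. The sketch below writes it as a Python dict. The set name "rs0" and all host names are made up. Hidden and delayed members need priority 0 so they can never become primary.
</p>

```python
replica_config = {
    "_id": "rs0",  # hypothetical replica set name
    "members": [
        {"_id": 0, "host": "ny1.example.com:27017",
         "tags": {"datacenter": "new york"}},
        {"_id": 1, "host": "ny2.example.com:27017",
         "tags": {"datacenter": "new york"}},
        {"_id": 2, "host": "lon1.example.com:27017",
         "tags": {"datacenter": "london"}},
        # Hidden from clients and kept one hour behind, for backups only
        {"_id": 3, "host": "backup.example.com:27017",
         "priority": 0, "hidden": True, "slaveDelay": 3600},
        # Votes in elections but stores no data
        {"_id": 4, "host": "arbiter.example.com:27017", "arbiterOnly": True},
    ],
}

members = replica_config["members"]
data_bearing = [m for m in members if not m.get("arbiterOnly")]
special = [m for m in members if m.get("hidden") or m.get("slaveDelay")]
```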
<p>
There are 5 read preference modes
</p>
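<p>
The five modes are primary, primaryPreferred, secondary, secondaryPreferred and nearest. The sketch below shows roughly how each one narrows the choice of node. It is a simplification: real drivers also apply tag sets and pick at random among nodes within a latency window, and the pick_node helper and node data are invented for illustration.
</p>

```python
READ_PREFERENCES = (
    "primary", "primaryPreferred", "secondary", "secondaryPreferred", "nearest",
)


def closest(group):
    """Name of the lowest-latency node in the group, or None if it is empty."""
    if not group:
        return None
    return min(group, key=lambda n: n["ping_ms"])["name"]


def pick_node(mode, nodes):
    """Choose a node name for a read, or None if no node qualifies."""
    primaries = [n for n in nodes if n["role"] == "primary"]
    secondaries = [n for n in nodes if n["role"] == "secondary"]
    if mode == "primary":
        return closest(primaries)
    if mode == "primaryPreferred":
        return closest(primaries) or closest(secondaries)
    if mode == "secondary":
        return closest(secondaries)
    if mode == "secondaryPreferred":
        return closest(secondaries) or closest(primaries)
    if mode == "nearest":
        return closest(nodes)
    raise ValueError(f"unknown read preference: {mode}")


nodes = [
    {"name": "p", "role": "primary", "ping_ms": 30},
    {"name": "s1", "role": "secondary", "ping_ms": 5},
    {"name": "s2", "role": "secondary", "ping_ms": 12},
]
```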
<p>
You can test all this on a single machine
</p>
<p>
Failure points:
</p>
<ul type="disc">
<li>
Power
</li>
<li>
Network
</li>
<li>
Data Center (5 nodes is safest: 2 (primary) + 2 (primary) + 1 (backup DC))
</li>
<li lang="en-US">
Multi-node failure can occur e.g. 2 out of 3 fail
</li>
</ul>
<p lang="en-US">
When only one node is left and it cannot see a majority of the set, no primary can be elected and the whole cluster becomes read-only
</p>
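<p lang="en-US">
The majority rule behind these notes fits in a few lines: a primary can only be elected while more than half of the voting members can see each other. The arithmetic below shows why odd node counts are advised and why a lone survivor of a 3-node set goes read-only. The function names are made up for this sketch.
</p>

```python
def can_elect_primary(total_voting_members, reachable_members):
    """A primary needs a strict majority of all voting members."""
    return reachable_members > total_voting_members // 2


def failures_tolerated(total_voting_members):
    """How many members can fail while a majority still remains."""
    return (total_voting_members - 1) // 2


# 3-node set: losing one node is fine, losing two leaves it read-only
three_minus_one = can_elect_primary(3, 2)
three_minus_two = can_elect_primary(3, 1)

# A 4th node adds cost but no extra fault tolerance; a 5th does
tolerance = {n: failures_tolerated(n) for n in (3, 4, 5)}
```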
<p lang="en-US">
You can disable indexing if you want to, e.g. on a backup node that isn't ever queried
</p>
<p lang="en-US">
OpenStreetMap data contains lots of Points Of Interest, e.g. pubs
</p>
<p lang="en-US">
MongoDB can be used with Hadoop
</p>
<p lang="en-US">
There's a mongo-storm project
</p>