2014-07-03-mongodb-london-2013

Some very rough notes from MongoDB London 2013:

Session 1 - Performance

Keep indexes in memory

Data in memory if you can

Slow queries can be configured to appear in logs

Use SSDs

Growing documents is bad

Do an 'explain' on queries

Padding factor
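
A rough sketch of why growing documents hurts and what the padding factor buys you. It assumes the storage engine of the time reserved roughly document size multiplied by the padding factor for each record; the function names are made up for illustration and are not MongoDB APIs.

```python
def allocated_bytes(document_bytes: int, padding_factor: float) -> int:
    """Space reserved for a record: document size scaled by the padding factor
    (an assumed, simplified model)."""
    return int(document_bytes * padding_factor)


def fits_in_place(old_bytes: int, new_bytes: int, padding_factor: float) -> bool:
    """True if the grown document still fits in its original allocation,
    so it would not have to be moved on disk."""
    return new_bytes <= allocated_bytes(old_bytes, padding_factor)


# A 1000-byte document with a padding factor of 1.5 has room to grow to 1500 bytes.
grows_a_little = fits_in_place(1000, 1400, 1.5)
grows_a_lot = fits_in_place(1000, 1600, 1.5)
```

With no padding (a factor of 1.0), any growth at all would force a move in this model.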

DB locks when writing

Sharding to scale writes

Optionally read from slaves but they may not have the written data yet

Write concern level configurable

Can set importance level of writes based on the node that has acknowledged the write
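
One way to picture a write concern that depends on which nodes acknowledged: a rule such as "acknowledged in at least two distinct datacenters". As far as I understand, MongoDB expresses this kind of rule as a custom write concern mode built on member tags; the sketch below only imitates the check in plain Python, and `satisfies_mode` plus the sample tags are hypothetical.

```python
def satisfies_mode(mode: dict, acked_member_tags: list) -> bool:
    """mode maps a tag name to the number of distinct values of that tag
    that must appear among the members that acknowledged the write."""
    for tag, required_distinct in mode.items():
        seen = {tags[tag] for tags in acked_member_tags if tag in tags}
        if len(seen) < required_distinct:
            return False
    return True


# Hypothetical rule: the write must reach two different datacenters.
multi_dc = {"datacenter": 2}

acks_one_dc = [{"datacenter": "new york"}, {"datacenter": "new york"}]
acks_two_dcs = [{"datacenter": "new york"}, {"datacenter": "london"}]

one_dc_ok = satisfies_mode(multi_dc, acks_one_dc)
two_dcs_ok = satisfies_mode(multi_dc, acks_two_dcs)
```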

You can define your own _id structure to help querying

Use short field names - use an abstraction layer
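
A minimal sketch of such an abstraction layer, assuming a simple one-to-one mapping between readable names in the application and short names in stored documents. The field names and helpers are invented for the example.

```python
# Hypothetical mapping: readable name in the application -> short name on disk.
FIELD_MAP = {"username": "u", "created_at": "c", "login_count": "n"}
REVERSE_MAP = {short: long for long, short in FIELD_MAP.items()}


def to_stored(document: dict) -> dict:
    """Rename known fields to their short form before saving."""
    return {FIELD_MAP.get(key, key): value for key, value in document.items()}


def from_stored(document: dict) -> dict:
    """Rename short fields back to readable names after loading."""
    return {REVERSE_MAP.get(key, key): value for key, value in document.items()}


user = {"username": "alice", "login_count": 3}
stored = to_stored(user)        # what would be written to the database
restored = from_stored(stored)  # what the application sees again
```

Since field names are stored in every document, shorter names save space, while the mapping keeps the application code readable.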

Covered indexes
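
A covered query is one that can be answered from the index alone, without loading the documents. The check below is a simplified sketch of that rule: it assumes an inclusion-style projection and ignores complications such as array fields. `is_covered` is a made-up helper, not a driver function.

```python
def is_covered(index_fields, filter_fields, projection) -> bool:
    """True if every field the query filters on or returns is in the index.

    projection is a find()-style dict such as {"user": 1, "_id": 0}.
    """
    # Without at least one included field, whole documents are returned.
    if not any(projection.values()):
        return False
    returned = {field for field, include in projection.items() if include}
    # _id comes back by default unless it is explicitly excluded.
    if projection.get("_id", 1):
        returned.add("_id")
    return (set(filter_fields) | returned) <= set(index_fields)


index = ["user", "score"]
covered = is_covered(index, ["user"], {"score": 1, "_id": 0})
not_covered = is_covered(index, ["user"], {"score": 1})  # _id is not in the index
```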

Dropping a collection is faster than removing all of its documents

mongostat

Run your own benchmark - benchrun

serverdensity.com/mdb

@davidmytton

Document per day, pre-allocated, then use the $inc operator
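
A sketch of the document-per-day pattern: create the day's document up front with every hourly counter set to zero, then only ever increment, so the document never grows. The `_id` layout, field names and helpers are assumptions for illustration, and `apply_inc` is just an in-memory stand-in for what the server does with `$inc`.

```python
def new_day_document(metric: str, day: str) -> dict:
    """Pre-allocated document: one zeroed counter per hour of the day."""
    return {"_id": f"{metric}:{day}", "hours": {str(h): 0 for h in range(24)}}


def inc_update(hour: int, amount: int = 1) -> dict:
    """Update document that bumps a single hourly counter."""
    return {"$inc": {f"hours.{hour}": amount}}


def apply_inc(document: dict, update: dict) -> dict:
    """In-memory imitation of $inc on dotted paths that already exist."""
    for path, amount in update["$inc"].items():
        *parents, leaf = path.split(".")
        target = document
        for key in parents:
            target = target[key]
        target[leaf] += amount
    return document


day_doc = new_day_document("pageviews", "2013-01-01")  # hypothetical metric and date
apply_inc(day_doc, inc_update(14))
apply_inc(day_doc, inc_update(14, 2))
```

The structured `_id` also means a day's document can be fetched directly by key.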

Session 2 - Backups

  <p lang="en-US">
    bsondump converts bson to json
  </p>
  
  <p lang="en-US">
    Use journalling
  </p>
  
  <p lang="en-US">
    Disk backups faster
  </p>
  
      <p lang="en-US">
        TTL indexes and capped collections
      </p>
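
A TTL index is an ordinary index on a date field with an `expireAfterSeconds` option; documents older than that get removed by a background task. The sketch below only imitates the expiry rule in plain Python. The field name `createdAt` and the one-hour value are assumptions for the example.

```python
from datetime import datetime, timedelta

# Shape of a TTL index definition: the indexed date field plus an expiry in seconds.
ttl_index = {"key": {"createdAt": 1}, "expireAfterSeconds": 3600}


def is_expired(document: dict, index: dict, now: datetime) -> bool:
    """True once the indexed date is older than expireAfterSeconds."""
    field = next(iter(index["key"]))
    lifetime = timedelta(seconds=index["expireAfterSeconds"])
    return document[field] + lifetime <= now


session = {"createdAt": datetime(2013, 1, 1, 12, 0, 0)}
still_alive = is_expired(session, ttl_index, datetime(2013, 1, 1, 12, 30, 0))
gone = is_expired(session, ttl_index, datetime(2013, 1, 1, 13, 30, 0))
```

Because removal happens in the background, a document may linger a little past its expiry time.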
      
          <p lang="en-US">
            <span style="font-weight: bold;">Replication</span>
          </p>
          
          <p lang="en-US">
            An uneven number of nodes is advised
          </p>
          
          <p lang="en-US">
            There's a mesh of heartbeats between the nodes
          </p>
          
          <p lang="en-US">
            An arbiter node only exists for voting - it stores no data
          </p>
          
          <p lang="en-US">
            You can have hidden nodes, for "backup" purposes only
          </p>
          
          <p lang="en-US">
            You can give a node a slaveDelay so the replication is delayed
          </p>
          
          <p>
            Servers can be tagged, e.g. { datacenter: "new york" }
          </p>
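
To make the member options above concrete, here is roughly what a replica set configuration with an arbiter, a hidden node, a delayed node and tags could look like, written as a Python dict. The option names (`arbiterOnly`, `hidden`, `priority`, `slaveDelay`, `tags`) are the ones used by MongoDB of that era; the set name, hostnames and values are invented.

```python
replica_set_config = {
    "_id": "rs0",  # hypothetical replica set name
    "members": [
        {"_id": 0, "host": "db1.example.net:27017", "tags": {"datacenter": "new york"}},
        {"_id": 1, "host": "db2.example.net:27017", "tags": {"datacenter": "london"}},
        # Votes in elections but stores no data.
        {"_id": 2, "host": "arb.example.net:27017", "arbiterOnly": True},
        # Invisible to clients and never primary: backups only.
        {"_id": 3, "host": "backup.example.net:27017", "priority": 0, "hidden": True},
        # Applies changes one hour late; must not become primary.
        {"_id": 4, "host": "delayed.example.net:27017", "priority": 0,
         "hidden": True, "slaveDelay": 3600},
    ],
}

members = replica_set_config["members"]
data_bearing = [m for m in members if not m.get("arbiterOnly")]
can_become_primary = [m for m in data_bearing if m.get("priority", 1) > 0]
```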
          
          <p>
            There are 5 read preference modes
          </p>
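
The five modes are `primary`, `primaryPreferred`, `secondary`, `secondaryPreferred` and `nearest`. The sketch below is a simplified picture of what each one means. Real drivers apply more rules (tags, latency windows, random choice among candidates), and `pick_node` and the node dicts are hypothetical.

```python
READ_PREFERENCE_MODES = (
    "primary", "primaryPreferred", "secondary", "secondaryPreferred", "nearest",
)


def pick_node(mode: str, nodes: list):
    """Pick a node for a read; nodes are dicts with name, is_primary, ping_ms."""
    if mode not in READ_PREFERENCE_MODES:
        raise ValueError(f"unknown read preference mode: {mode}")

    def closest(candidates):
        return min(candidates, key=lambda n: n["ping_ms"]) if candidates else None

    primaries = [n for n in nodes if n["is_primary"]]
    secondaries = [n for n in nodes if not n["is_primary"]]

    if mode == "primary":
        return closest(primaries)
    if mode == "primaryPreferred":
        return closest(primaries) or closest(secondaries)
    if mode == "secondary":
        return closest(secondaries)
    if mode == "secondaryPreferred":
        return closest(secondaries) or closest(primaries)
    return closest(nodes)  # nearest: lowest latency regardless of role


nodes = [
    {"name": "a", "is_primary": True, "ping_ms": 20},
    {"name": "b", "is_primary": False, "ping_ms": 5},
    {"name": "c", "is_primary": False, "ping_ms": 12},
]
chosen = {mode: pick_node(mode, nodes)["name"] for mode in READ_PREFERENCE_MODES}
```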
          
          <p>
            You can test all this on a single machine
          </p>
          
          <p>
            Failure points:
          </p>
          
          <ul type="disc">
            <li>
              Power
            </li>
            <li>
              Network
            </li>
            <li>
              Data Center (5 nodes safest: 2 (primary) + 2 (primary) + 1 (backup DC))
            </li>
            <li lang="en-US">
              Multi-node failure can occur e.g. 2 out of 3 fail
            </li>
          </ul>
          
          <p lang="en-US">
            When only one node of the set is left running it can't see a majority, so the whole cluster becomes read-only
          </p>
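
The arithmetic behind these failure cases is small enough to sketch: a primary can only be elected while a strict majority of voting members can see each other. The function names are made up for the example.

```python
def majority(voting_members: int) -> int:
    """Smallest number of votes that is more than half."""
    return voting_members // 2 + 1


def can_elect_primary(voting_members: int, reachable: int) -> bool:
    """True if the reachable members form a majority of the whole set."""
    return reachable >= majority(voting_members)


# 3-node set, 2 nodes fail: the survivor cannot elect itself, so reads only.
three_nodes_one_left = can_elect_primary(3, 1)

# 5-node set spread 2 + 2 + 1: losing a site with 2 nodes still leaves 3 of 5.
five_nodes_lose_site = can_elect_primary(5, 3)

# How many failures each set size tolerates; 4 nodes tolerate no more than 3 do.
tolerated = {n: n - majority(n) for n in (3, 4, 5)}
```

This is also why an odd number of nodes is advised: adding a fourth node raises the majority needed without raising the number of failures the set can survive.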
          
          <p lang="en-US">
            You can disable indexing if you want to, e.g. on a backup node that isn't ever queried
          </p>
          
          <p lang="en-US">
            OpenStreetMap data contains lots of Points Of Interest, e.g. pubs
          </p>
          
          <p lang="en-US">
            MongoDB can be used with Hadoop
          </p>
          
          <p lang="en-US">
            There's a mongo-storm project
          </p>

tjrobinson.net
