<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Shaun Abram &#187; database</title>
	<atom:link href="http://www.shaunabram.com/tag/database/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.shaunabram.com</link>
	<description>Java and Technology weblog</description>
	<lastBuildDate>Thu, 09 Feb 2012 18:06:07 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>OSCON Day3: Database Scalability</title>
		<link>http://www.shaunabram.com/oscon-day3-database-scalability/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=oscon-day3-database-scalability</link>
		<comments>http://www.shaunabram.com/oscon-day3-database-scalability/#comments</comments>
		<pubDate>Sat, 24 Jul 2010 23:48:55 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Events]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[oscon]]></category>

		<guid isPermaLink="false">http://www.shaunabram.com/?p=878</guid>
		<description><![CDATA[I spent the afternoon of Day3 at OSCON attending two interesting database scalability talks. The first was on Database Scalability Patterns; the second on Database Sharding (and Spider for MySQL). All my notes are below&#8230; Database Scalability Patterns The first talk on &#8220;Database Scalability Patterns&#8221; was by Robert Treat from OmniTI. Database scalability patterns are [...]]]></description>
			<content:encoded><![CDATA[<p>I spent the afternoon of Day3 at <a href="http://www.oscon.com/oscon2010">OSCON</a> attending two interesting database scalability talks. The first was on Database Scalability Patterns; the second on Database Sharding (and Spider for MySQL).</p>
<p>All my notes are below&#8230;</p>
<p><span id="more-878"></span></p>
<h2>Database Scalability Patterns</h2>
<p>The first talk on &#8220;<a href="http://www.oscon.com/oscon2010/public/schedule/detail/13226">Database Scalability Patterns</a>&#8221;<br />
was by Robert Treat from <a href="http://omniti.com/">OmniTI</a>.</p>
<p>Database scalability patterns are part design patterns and part application life cycle.</p>
<h4>Phases</h4>
<p>Typically, most databases go through the following phases:<br />
1. MyFirstDatabase<br />
2. Vertical partitioning<br />
3. Vertical scaling<br />
4. Read slaves<br />
5. Horizontal partitioning</p>
<h6>1. MyFirstDatabase </h6>
<p>MyFirstDatabase is just a term to refer to a simple, unsophisticated database setup.</p>
<h6>2. Vertical Partitioning</h6>
<p>You now have much more data and more transactions.<br />
The <a href="http://en.wikipedia.org/wiki/Partition_%28database%29">Wikipedia definition</a> of Vertical partitioning is creating tables with fewer columns and using additional tables to store the remaining columns.  </p>
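<p>To make that definition concrete, here is a minimal sketch using Python&#8217;s built-in sqlite3 module. The table and column names (users, user_profiles, biography) are my own assumptions for illustration and were not part of the talk.</p>

```python
import sqlite3

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Frequently read columns stay in the narrow main table.
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Bulky, rarely read columns move to a second table keyed by the same id.
    CREATE TABLE user_profiles (
        user_id INTEGER PRIMARY KEY REFERENCES users(id),
        biography TEXT
    );
    INSERT INTO users (id, name) VALUES (1, 'ann');
    INSERT INTO user_profiles (user_id, biography) VALUES (1, 'Long biography text');
    """
)

# Common queries touch only the narrow table.
names = conn.execute("SELECT name FROM users").fetchall()

# The full row is rebuilt with a join only when it is needed.
full_row = conn.execute(
    "SELECT u.name, p.biography FROM users u "
    "JOIN user_profiles p ON p.user_id = u.id WHERE u.id = 1"
).fetchone()
```

<p>The trade-off is that any query needing all of the columns now pays for a join, so the split only helps if most queries need just the narrow table.</p>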
<h6>3. Vertical Scaling</h6>
<p>Adding more RAM, more disks. Basically, adding more hardware.<br />
This is not necessarily a one time deal as you can do multiple iterations.</p>
<h6>4. Read Slaves</h6>
<p>&#8220;Read Slaves&#8221; or &#8220;Master &#8211; Slave&#8221;<br />
I think the idea here is that you have a single destination for database writes, but multiple sources for database reads. These &#8216;slave&#8217; read databases are replicated from the single write source.<br />
This approach, however, does not work for write bottlenecks! It only works for read-heavy loads.<br />
Typically:<br />
- Full copy of data on each node<br />
- Asynchronous<br />
Consider:<br />
- Partial copy<br />
- Synchronous<br />
- Perhaps don&#8217;t use an RDBMS</p>
<p>Also note that the Master/Slave setup typically requires changes to your application code. But overall, the Master/Slave approach is fairly easy.</p>
<p>Note that scaling a large number of database writes can be much more difficult. One approach is to use multiple Masters, but while there are many ways to implement multiple masters, there are few that really work in a production environment! You may be able to reduce CPU, but it is difficult to reduce the I/O. Really, the Master/Slave approach is a fail-over solution, not a scalability solution.</p>
<h6>5. Horizontal partitioning</h6>
<p>According to <a href="http://en.wikipedia.org/wiki/Partition_%28database%29">Wikipedia</a>, Horizontal partitioning is putting different rows into different tables. It is sometimes referred to as Sharding.<br />
- Move each piece to own server<br />
- Duplicate some data as needed<br />
- When splitting the data, you must separate dependencies in the app code first!<br />
Note that each node is a new instance of vertical scaling.</p>
<h4>&#8220;Universal truths&#8221; of scaling databases</h4>
<p>1) Vertical scalability is helpful for every pattern<br />
Even in a horizontally scaled, fully distributed database, the number of nodes needed is affected by vertical scalability<br />
2) New nodes are never free<br />
- Adds points of failure<br />
- Adds management costs<br />
- Adds complexity to architecture<br />
- Adds complexity to your app code</p>
<h4>Tips</h4>
<p>- Plan for layered data sources<br />
- Read/write connections in code, i.e. have separate read and write connections in the code from the start. Then you can have planned outages where you can still read, but not do updates.<br />
- Use schemas to separate services (think about what pieces of data need to talk/be aware of each other, and what do not)</p>
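<p>The tip about separate read and write connections could look something like the sketch below. This is only my own illustration, not code from the talk: FakeConnection stands in for a real database driver, and the ReadWriteRouter name and its round-robin choice of read connection are assumptions.</p>

```python
import itertools


class FakeConnection:
    """Stand-in for a real database connection; it only records statements."""

    def __init__(self, name):
        self.name = name
        self.statements = []

    def execute(self, sql):
        self.statements.append(sql)
        return self.name


class ReadWriteRouter:
    """Sends writes to the single master and spreads reads over the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master if no read slaves are configured.
        self._readers = itertools.cycle(slaves or [master])

    def write(self, sql):
        return self.master.execute(sql)

    def read(self, sql):
        # Round-robin across the read connections.
        return next(self._readers).execute(sql)


master = FakeConnection("master")
slaves = [FakeConnection("slave-1"), FakeConnection("slave-2")]
router = ReadWriteRouter(master, slaves)

router.write("INSERT INTO users (name) VALUES ('ann')")
first = router.read("SELECT name FROM users")
second = router.read("SELECT name FROM users")
third = router.read("SELECT name FROM users")
```

<p>Because the calling code already distinguishes reads from writes, read slaves can be added later without touching the callers. With asynchronous replication, a read issued straight after a write may still see the old data.</p>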
<h2>Sharding</h2>
<p>The second database scaling talk was on &#8220;<a href="http://www.oscon.com/oscon2010/public/schedule/detail/14059">Sharding for the Masses</a>&#8221; by Giuseppe Maxia (MySQL @ Oracle).</p>
<p>He simply described sharding as &#8220;breaking a database into pieces&#8221;.<br />
It is used simply as a way of dealing with too much data &#038; traffic.</p>
<h4>Replication</h4>
<p>One approach to scaling is replication.<br />
Client sends a write to master and the reads to a load balancer.<br />
The write master and the read load balancer distribute to slaves.<br />
I think this is simply &#8216;4. Read Slaves&#8217; from the database patterns talk above. Again, the speaker pointed out that this approach doesn&#8217;t scale well when you have too many writes because the &#8216;Write master&#8217; becomes the single point of failure.</p>
<h4>Homemade Sharding</h4>
<p>So, another alternative is to use homemade sharding.<br />
&#8216;Homemade&#8217; because your app contains the logic for how and where to put data.<br />
However, there are two problems with this homemade approach.<br />
The first is that your application logic can become very complex.<br />
The second is that if the sharding logic is in your application, the approach doesn&#8217;t work if data is accessed via a different application!</p>
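<p>As a rough idea of what that in-app logic might look like, here is a minimal sketch that picks a shard from a hash of the key. The shard names and the shard_for function are hypothetical, and hashing the key modulo the number of shards is just one possible scheme, not necessarily what the speaker had in mind.</p>

```python
import hashlib

# Hypothetical server names, one per shard.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2"]


def shard_for(user_id):
    """Map a key to one shard using a stable hash of the key."""
    digest = hashlib.md5(str(user_id).encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]


# Group a handful of keys by the shard they would be stored on.
placement = {}
for user_id in range(1, 7):
    placement.setdefault(shard_for(user_id), []).append(user_id)
```

<p>Every application that touches the data needs exactly the same function, which is the second problem above. Also, with this simple modulo scheme, changing the number of shards moves most keys to a different shard.</p>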
<h4>Sharding via Spider for MySQL</h4>
<p>So, instead of using a homemade approach, the speaker suggested using <a href="http://spiderformysql.com/">Spider</a> for MySQL.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.shaunabram.com/oscon-day3-database-scalability/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>OSCON Day1: Test Driven Database Development</title>
		<link>http://www.shaunabram.com/oscon-day1-tddd/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=oscon-day1-tddd</link>
		<comments>http://www.shaunabram.com/oscon-day1-tddd/#comments</comments>
		<pubDate>Wed, 21 Jul 2010 01:57:11 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Events]]></category>
		<category><![CDATA[Testing]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[oscon]]></category>
		<category><![CDATA[postgres]]></category>
		<category><![CDATA[tdd]]></category>
		<category><![CDATA[tddd]]></category>

		<guid isPermaLink="false">http://www.shaunabram.com/?p=871</guid>
		<description><![CDATA[The first tutorial at OSCON was on Test Driven Database Development. The idea was to use pgTAP to write unit tests to check database correctness, including table structures, views and stored procedures. As a fan of Test Driven Development (TDD) for regular code, the concept of using it on the database tier makes a lot [...]]]></description>
			<content:encoded><![CDATA[<p>The first tutorial at <a href="http://www.oscon.com/oscon2010">OSCON</a> was on <a href="http://www.oscon.com/oscon2010/public/schedule/detail/14168">Test Driven Database Development</a>. The idea was to use <a href="http://pgtap.org/">pgTAP</a> to write unit tests to check database correctness, including table structures, views and stored procedures. As a fan of Test Driven Development (TDD) for regular code, the concept of using it on the database tier makes a lot of sense.</p>
<p>Unfortunately I had a lot of problems getting the required software setup working, which included <a href="http://www.postgresql.org/">PostgreSQL</a>, <a href="http://pgtap.org/">pgTAP</a>, <a href="http://search.cpan.org/~andya/Test-Harness-3.21/lib/Test/Harness.pm">Test::Harness</a>, <a href="http://www.gnu.org/software/make/">make</a> and <a href="http://www.perl.org/">perl</a>. Ultimately I wasn&#8217;t able to get the examples running due to incompatibilities between PostgreSQL and pgTAP on my MacBook Pro (OS X 10.5.8) and ended with this error:</p>
<p><code>dyld: Library not loaded: /usr/local/lib/libxml2.2.dylib<br />
  Referenced from: /Library/PostgreSQL/8.4/lib/postgresql/pgxs/src/makefiles/../../src/test/regress/pg_regress<br />
  Reason: Incompatible library version: pg_regress requires version 10.0.0 or later, but libxml2.2.dylib provides version 9.0.0</code></p>
<p>I considered trying to upgrade libxml, but there were <a href="http://superuser.com/questions/132177/os-x-not-booting-after-upgrading-libxml2">suggestions</a> that this could cause my machine to not boot! I even considered upgrading to OS X 10.6 (Snow Leopard), but decided that this was a little too close to shaving yaks.</p>
<p>I would really like to get more familiar with pgTAP at some point, but I will have to put it on hold for now&#8230;</p>
<p>Update: I managed to get some input from <a href="http://www.oscon.com/oscon2010/public/schedule/speaker/6582">David Wheeler</a>, worked through the technical issues and got all the tests running. Thanks David! Despite the earlier setup problems, I came away with a very positive feeling about TDDD and pgTAP and can see it playing a part in any future database schema development I do.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.shaunabram.com/oscon-day1-tddd/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

