Showing posts with label RavenDB.

Tuesday, January 31, 2017

.NET Standard Adoption as of January 2017

Updated 2/16 to include Elasticsearch

As should be obvious from my recent blog posts, I have really been enjoying working with .NET Core. Clearly I am not alone, as a significant number of libraries have been porting over to the .NET Standard.

Below is a list of libraries that have added support for the .NET Standard, meaning that they should be able to run cross-platform on both Windows and Linux.

While I have not yet had the opportunity to try all of the libraries listed below, I have had great luck with the ones that I have tested, and I am simply ecstatic to see this list growing as fast as it is.

Technology | NuGet Package | .NET Standard Support
Autofac | Autofac | Released for 1.1
Cassandra | DataStax C# Driver for Apache Cassandra | Released for 1.5
Couchbase | Couchbase SDK 2.0 | Beta for 1.5
Elasticsearch | Elasticsearch.Net | Released for 1.3
Kafka | Confluent.Kafka | Preview for 1.3
log4net | Apache log4net | Released for 1.3
MongoDB | MongoDB.Driver | Released for 1.4
NLog | NLog | Beta for 1.3
RabbitMQ | RabbitMQ.Client | Released for 1.5
RavenDB | RavenDB Client | Released for 1.3
Redis | StackExchange.Redis | Released for 1.5
Sqlite | Microsoft.EntityFrameworkCore.Sqlite | Released for 1.3
WebSocket Client | WebSocket4Net | Released for 1.3

How have these libraries been working out for you? Is there a better option than what I have listed? Please leave a comment and let me know!

Enjoy,
Tom

Sunday, July 27, 2014

RavenDB 2.5 vs 3.0 Write Performance (so far)

Is Voron outperforming Esent in RavenDB 3.0?

...not yet, at least in terms of write speed. I ran a few tests on my home machine to compare the write and indexing speeds of Raven 2.5 (build 2908) against Raven 3.0 (build 3358). Unfortunately, the results were not encouraging. However, it is worth pointing out that the Raven team saved their performance updates for last when releasing Raven 2.5, so I do expect this to improve before we see an RC.
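I have not reproduced the import harness here, but the shape of it is simple: time a bulk insert of a batch of documents. Below is a minimal sketch of such a loop using the RavenDB client; the Doc class, server URL, database name, and batch size are illustrative assumptions rather than the exact code behind the results.

// A minimal sketch of the kind of timing loop involved; the Doc shape, server URL,
// database name, and batch size are illustrative assumptions, not the exact harness
// used for the table below.
using System;
using System.Diagnostics;
using Raven.Client.Document;

public class ImportTimer
{
    public class Doc
    {
        public string Text { get; set; }
    }

    public static void Main()
    {
        using (var store = new DocumentStore
        {
            Url = "http://localhost:8080",
            DefaultDatabase = "PerfTest"
        }.Initialize())
        {
            var stopwatch = Stopwatch.StartNew();

            // Bulk insert 100k documents and time the import.
            using (var bulkInsert = store.BulkInsert())
            {
                for (var i = 0; i < 100000; i++)
                {
                    bulkInsert.Store(new Doc { Text = "document " + i });
                }
            }

            stopwatch.Stop();
            Console.WriteLine("Elapsed import time: {0}", stopwatch.Elapsed);

            // Index time could be measured separately, for example by polling
            // store.DatabaseCommands.GetStatistics().StaleIndexes until it is empty.
        }
    }
}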

Test Results

Here are the results of my (little) performance tests:

Document Count | 2.5 Import Time | 2.5 Index Time | 3.0 Import Time | 3.0 Index Time | Import Difference | Index Difference
0 - 100k | 0:57.48 | 1:41.45 | 1:08.59 | 1:25.39 | -19.33% | 15.82%
100k - 200k | 1:02.68 | 1:34.85 | 1:10.87 | 1:35.65 | -13.08% | -0.84%
200k - 300k | 1:00.34 | 2:17.84 | 1:12.94 | 1:47.20 | -20.89% | 22.22%
300k - 400k | 1:00.85 | 1:38.59 | 1:13.46 | 1:45.61 | -20.73% | -7.12%
400k - 500k | 1:02.03 | 1:38.70 | 1:12.03 | 1:58.51 | -16.12% | -20.07%

Sunday, July 13, 2014

Use RavenDB to power Data Driven xUnit Theories

I love xUnit's data driven unit tests, and I also really enjoy working with RavenDB; now I can use them together!

Data driven unit tests are very powerful tools that allow you to execute the same test code against multiple data sets. Testing frameworks such as xUnit make this extremely easy by offering an out of the box set of attributes for annotating your test methods with dynamic data sources.

Below is some simple code that adds a RavenDataAttribute to xUnit. This attribute will pull arguments from a document database and pass them into your unit test, using the fully qualified method name as a key.

Example Unit Tests

public class RavenDataTests
{
    [Theory]
    [RavenData]
    public void PrimitiveArgs(int number, bool isDivisibleBytwo)
    {
        var remainder = number % 2;
        Assert.Equal(isDivisibleBytwo, remainder == 0);
    }
 
    [Theory]
    [RavenData]
    public void ComplexArgs(ComplexArgsModel model)
    {
        var remainder = model.Number % 2;
        Assert.Equal(model.IsDivisibleByTwo, remainder == 0);
    }
 
    [Fact(Skip = "Only run once for setup")]
    public void Setup()
    {
        var type = typeof(RavenDataTests);
 
        var primitiveArgsMethod = type.GetMethod("PrimitiveArgs");
        var primitiveArgs = new object[] { 3, false };
        RavenDataAttribute.SaveData(primitiveArgsMethod, primitiveArgs);
 
        var complexArgsMethod = type.GetMethod("ComplexArgs");
        var complexArgsModel = new ComplexArgsModel
        {
            IsDivisibleByTwo = true,
            Number = 4
        };
        RavenDataAttribute.SaveData(complexArgsMethod, complexArgsModel);
    }
 
    public class ComplexArgsModel
    {
        public int Number { get; set; }
        public bool IsDivisibleByTwo { get; set; }
    }
}
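The attribute itself is not shown above. As a rough sketch of how a RavenDataAttribute could be implemented, the version below derives from xUnit's DataAttribute (the xUnit 2.x signature is assumed) and stores each test method's arguments in a document keyed by the fully qualified method name. The store settings, document shape, and key convention are illustrative assumptions, and coercing primitive arguments back from their JSON representation is glossed over.

// A rough sketch only, not the original implementation. Assumes the xUnit 2.x
// Xunit.Sdk.DataAttribute base class and the RavenDB client; the store settings,
// document shape, and key convention below are illustrative.
using System.Collections.Generic;
using System.Reflection;
using Raven.Client;
using Raven.Client.Document;
using Xunit.Sdk;

public class RavenDataAttribute : DataAttribute
{
    // Hypothetical shared store; point it at whatever server holds your test data.
    private static readonly IDocumentStore Store = new DocumentStore
    {
        Url = "http://localhost:8080",
        DefaultDatabase = "TestData"
    }.Initialize();

    private class TheoryDocument
    {
        public object[] Arguments { get; set; }
    }

    // Fully qualified method name as the document key.
    private static string GetKey(MethodInfo method)
    {
        return "TheoryData/" + method.DeclaringType.FullName + "." + method.Name;
    }

    public override IEnumerable<object[]> GetData(MethodInfo testMethod)
    {
        using (var session = Store.OpenSession())
        {
            var document = session.Load<TheoryDocument>(GetKey(testMethod));

            if (document != null)
            {
                // NOTE: arguments round-trip through JSON here; a real implementation
                // would coerce them back to the test method's parameter types.
                yield return document.Arguments;
            }
        }
    }

    public static void SaveData(MethodInfo method, params object[] arguments)
    {
        using (var session = Store.OpenSession())
        {
            session.Store(new TheoryDocument { Arguments = arguments }, GetKey(method));
            session.SaveChanges();
        }
    }
}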

Wednesday, August 7, 2013

Last in Win Replication for RavenDB

One of my favorite features of RavenDB is how easy it is to customize and extend.

RavenDB offers an extremely easy-to-use, built-in replication bundle. To deal with replication conflicts, the RavenDB.Database NuGet package includes an abstract base class (AbstractDocumentReplicationConflictResolver) that you can implement with your own conflict resolution rules.

Last In Wins Replication Conflict Resolver

John Bennett wrote a LastInWinsReplicationConflictResolver for RavenDB 1.0, and I have updated it for RavenDB 2.0 and 2.5. As always you can get that code from GitHub!

Download RavenExtensions from GitHub

Once you have built your resolver, you need only drop the assembly into the Plugins folder at the root of your RavenDB server and it will automatically be detected and loaded the next time that your server starts.

public class LastInWinsReplicationConflictResolver
    : AbstractDocumentReplicationConflictResolver
{
    private readonly ILog _log = LogManager.GetCurrentClassLogger();
 
    public override bool TryResolve(
        string id,
        RavenJObject metadata,
        RavenJObject document,
        JsonDocument existingDoc,
        Func<string, JsonDocument> getDocument)
    {
        if (ExistingDocShouldWin(metadata, existingDoc))
        {
            ReplaceValues(metadata, existingDoc.Metadata);
            ReplaceValues(document, existingDoc.DataAsJson);
            _log.Debug(
                "Replication conflict for '{0}' resolved with existing doc",
                id);
        }
        else
        {
            _log.Debug(
                "Replication conflict for '{0}' resolved with inbound doc",
                id);
        }
 
        return true;
    }
 
    private static bool ExistingDocShouldWin(
        RavenJObject newMetadata, 
        JsonDocument existingDoc)
    {
        if (existingDoc == null ||
            ExistingDocHasConflict(existingDoc) ||
            ExistingDocIsOlder(newMetadata, existingDoc))
        {
            return false;
        }
 
        return true;
    }
 
    private static bool ExistingDocHasConflict(JsonDocument existingDoc)
    {
        return existingDoc.Metadata[Constants.RavenReplicationConflict] != null;
    }
 
    private static bool ExistingDocIsOlder(
        RavenJObject newMetadata,
        JsonDocument existingDoc)
    {
        var newLastModified = GetLastModified(newMetadata);
 
        if (!existingDoc.LastModified.HasValue ||
            newLastModified.HasValue &&
            existingDoc.LastModified <= newLastModified)
        {
            return true;
        }
 
        return false;
    }
 
    private static DateTime? GetLastModified(RavenJObject metadata)
    {
        var lastModified = metadata[Constants.LastModified];
 
        return (lastModified == null)
            ? new DateTime?()
            : lastModified.Value<DateTime?>();
    }
 
    private static void ReplaceValues(RavenJObject target, RavenJObject source)
    {
        var targetKeys = target.Keys.ToArray();
        foreach (var key in targetKeys)
        {
            target.Remove(key);
        }
 
        foreach (var key in source.Keys)
        {
            target.Add(key, source[key]);
        }
    }
}

Enjoy,
Tom

Sunday, July 7, 2013

Lucene.Net Analyzer Viewer for RavenDB

To query your data in RavenDB you need to write queries in Lucene.Net.

To know which documents your queries are going to return, you need to know exactly how your query is being parsed by Lucene.Net. Full text analysis is a great baked-in feature of RavenDB, but I have found that the Lucene.Net standard analyzer that parses full text fields can sometimes return surprising results.

This Lucene.Net 3.0 Analyzer Viewer is an update of Andrew Smith's original version for Lucene.Net 2.0. It allows you to view the results of text analysis for the same version of Lucene that RavenDB is using. This simple tool can be invaluable for debugging full text searches in RavenDB!
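Under the hood, viewing an analyzer's output amounts to running text through its TokenStream and reading back each term. Here is a minimal sketch against the Lucene.Net 3.0.3 API; the analyzer choice and field name are illustrative, not the viewer's actual code.

// Minimal sketch: print the tokens an analyzer produces for a piece of text.
// Lucene.Net 3.0.3 API; the analyzer and field name here are illustrative.
using System;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Analysis.Tokenattributes;
using Version = Lucene.Net.Util.Version;

public static class AnalyzerDemo
{
    public static void PrintTokens(string text)
    {
        Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_30);

        // Run the text through the analyzer and walk the resulting token stream.
        TokenStream stream = analyzer.TokenStream("Text", new StringReader(text));
        ITermAttribute termAttribute = stream.AddAttribute<ITermAttribute>();

        while (stream.IncrementToken())
        {
            Console.WriteLine(termAttribute.Term);
        }
    }
}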

Download Raven.Extensions.AnalyzerViewer from GitHub

This tool also comes with my Alphanumeric Analyzer built in.


Enjoy,
Tom

Monday, May 27, 2013

How to Test RavenDB Indexes

What if you could spin up an entire database in memory for every unit test? You can!

RavenDB's EmbeddableDocumentStore (available via the RavenDB.Embedded NuGet package) allows you to create a complete in-memory instance of RavenDB. This makes writing integration tests for your custom indexes extremely easy.

The Hibernating Rhinos team makes full use of this feature by including a full suite of unit tests in their RavenDB solution. They even encourage people to submit pull requests to their GitHub repository so that they can pull those tests directly into their source. This is a BRILLIANT integration of all these technologies to both encourage testing and provide an extremely stable product.

So then, how do you test your RavenDB indexes? Great question; let's get into the code!

  1. Define your Document
public class Doc
{
    public int InternalId { get; set; }
    public string Text { get; set; }
}
  2. Define your Index
public class Index : AbstractIndexCreationTask<Doc>
{
    public Index()
    {
        Map = docs => from doc in docs
                      select new
                      {
                          doc.InternalId,
                          doc.Text
                      };
        Analyzers.Add(d => d.Text, "Raven.Extensions.AlphanumericAnalyzer");
    }
}
  3. Create your EmbeddableDocumentStore
  4. Insert your Index and Documents

In this example I am creating an abstract base class for my unit tests. The NewDocumentStore method provides an EmbeddableDocumentStore that comes pre-initialized with the default RavenDocumentsByEntityName index, your custom index, and a complete set of documents that have already been inserted. The documents come from an abstract Documents property, which we will see implemented below in step 5.

protected abstract ICollection<TDoc> Documents { get; }
 
private EmbeddableDocumentStore NewDocumentStore()
{
    var documentStore = new EmbeddableDocumentStore
    {
        Configuration =
        {
            RunInUnreliableYetFastModeThatIsNotSuitableForProduction = true,
            RunInMemory = true
        }
    };
 
    documentStore.Initialize();
 
    // Create Default Index
    var defaultIndex = new RavenDocumentsByEntityName();
    defaultIndex.Execute(documentStore);
 
    // Create Custom Index
    var customIndex = new TIndex();
    customIndex.Execute(documentStore);
 
    // Insert Documents from Abstract Property
    using (var bulkInsert = documentStore.BulkInsert())
        foreach (var document in Documents)
            bulkInsert.Store(document);
 
    return documentStore;
}
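For reference, the members above and the tests below are assumed to live in a generic abstract fixture using the TIndex and TDoc type parameters; a minimal sketch of that declaration (the class name and constraints are my assumptions, not the original code) might look like this:

// Assumed shape of the abstract fixture that owns NewDocumentStore(), the
// Documents property, and the tests below. The class name and constraints
// are illustrative.
using System.Collections.Generic;
using Raven.Client.Indexes;

public abstract class RavenIndexTests<TIndex, TDoc>
    where TIndex : AbstractIndexCreationTask, new()
{
    // Each concrete test class supplies the documents to insert (see step 5).
    protected abstract ICollection<TDoc> Documents { get; }

    // NewDocumentStore() from steps 3 and 4 belongs here as well.
}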
  5. Write your Tests

These tests exercise a custom Alphanumeric analyzer. They take in a series of Lucene queries and assert that each query matches the correct internal Ids. The documents themselves come from the abstract Documents property declared in step 4 and implemented below.

NOTE: Do not forget to include the WaitForNonStaleResults method on your queries, as your index may not be done building the first time you run your tests.

[Theory]
[InlineData(@"Text:Hello",              new[] {0})]
[InlineData(@"Text:file_name",          new[] {2})]
[InlineData(@"Text:name*",              new[] {2, 3})]
[InlineData(@"Text:my AND Text:txt",    new[] {2, 3})]
public void Query(string query, int[] expectedIds)
{
    int[] actualIds;
 
    using (var documentStore = NewDocumentStore())
    using (var session = documentStore.OpenSession())
    {
        actualIds = session.Advanced
            .LuceneQuery<Doc>("Index")
            .Where(query)
            .SelectFields<int>("InternalId")
            .WaitForNonStaleResults()
            .ToArray();
    }
 
    Assert.Equal(expectedIds, actualIds);
}
 
protected override ICollection<Doc> Documents
{
    get
    {
        return new[]
            {
                "Hello, world!",
                "Goodnight...moon?",
                "my_file_name_01.txt",
                "my_file_name01.txt"
            }
            .Select((t, i) => new Doc
            {
                InternalId = i,
                Text = t
            })
            .ToArray();
    }
}

Enjoy,
Tom

Sunday, May 12, 2013

Alphanumeric Lucene Analyzer for RavenDB

RavenDB's full text indexing uses Lucene.Net

RavenDB is a second generation document database. This means that you can throw typeless documents into a data store, but the only way to query them is through indexes that are built with Lucene.Net. RavenDB is a wonderful product whose primary strength is its simplicity and ease of use. In keeping with that theme, even when you need to customize RavenDB, it is relatively easy to do.

So, let's talk about customizing your Lucene.Net analyzer in RavenDB!

Available Analyzers

RavenDB comes equipped with all of the analyzers that are built into Lucene.Net. For the vast majority of use cases, these will do the job! Here are some examples:

  • "The fox jumped over the lazy dogs, Bob@hotmail.com 123432."
  • StandardAnalyzer, which is Lucene's default, will produce the following tokens:
    [fox] [jumped] [over] [lazy] [dog] [bob@hotmail.com] [123432]
  • SimpleAnalyzer will tokenize on all non-alpha characters, and will make all the tokens lowercase:
    [the] [fox] [jumped] [over] [the] [lazy] [dogs] [bob] [hotmail] [com]
  • WhitespaceAnalyzer will just tokenize on white spaces:
    [The] [fox] [jumped] [over] [the] [lazy] [dogs,] [Bob@hotmail.com]
    [123432.]

In order to resolve an issue with indexing file names (details below), I found myself in need of an Alphanumeric analyzer. This analyzer would be similar to the SimpleAnalyzer, but would still respect numeric values.

  • AlphanumericAnalyzer will tokenize on the .NET framework's Char.IsLetterOrDigit:
    [fox] [jumped] [over] [lazy] [dogs] [bob] [hotmail] [com] [123432]

Lucene.Net's base classes made this pretty easy to build...

How to Implement a Custom Analyzer

Grab all the code and more from GitHub:

Raven.Extensions.AlphanumericAnalyzer on GitHub

A Lucene analyzer is made of two basic parts: 1) a tokenizer, and 2) a series of filters. The tokenizer does the lion's share of the work and splits the input apart; the filters then run in succession, making additional tweaks to the tokenized output.

To create the Alphanumeric Analyzer we need only create two classes: an analyzer and a tokenizer. After that, the analyzer can reuse the existing LowerCaseFilter and StopFilter classes.

AlphanumericAnalyzer

public sealed class AlphanumericAnalyzer : Analyzer
{
    private readonly bool _enableStopPositionIncrements;
    private readonly ISet<string> _stopSet;
 
    public AlphanumericAnalyzer(Version matchVersion, ISet<string> stopWords)
    {
        _enableStopPositionIncrements = StopFilter
            .GetEnablePositionIncrementsVersionDefault(matchVersion);
        _stopSet = stopWords;
    }
 
    public override TokenStream TokenStream(String fieldName, TextReader reader)
    {
        // Tokenize on alphanumeric boundaries, then lowercase and strip stop words.
        TokenStream tokenStream = new AlphanumericTokenizer(reader);
        tokenStream = new LowerCaseFilter(tokenStream);
        tokenStream = new StopFilter(
            _enableStopPositionIncrements, 
            tokenStream, 
            _stopSet);
 
        return tokenStream;
    }
}

AlphanumericTokenizer

public class AlphanumericTokenizer : CharTokenizer
{
    public AlphanumericTokenizer(TextReader input)
        : base(input)
    {
    }
 
    protected override bool IsTokenChar(char c)
    {
        // Keep letters and digits together in a single token; split on everything else.
        return Char.IsLetterOrDigit(c);
    }
}

How to Install Plugins in RavenDB

Installing a custom plugin to RavenDB is unbelievably easy. Just compile your assembly, and then drop it into the Plugins folder at the root of your RavenDB server. You may then reference the analyzers in your indexes by their assembly qualified names, as in the sketch below.
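As a quick illustration, an index definition can name the analyzer like this; the exact namespace and assembly name depend on how the plugin assembly was compiled, so the string below is an assumption rather than a guaranteed value.

// Illustrative index definition. The analyzer's namespace and assembly name in
// the string below are assumptions that depend on how the plugin was built.
using System.Linq;
using Raven.Client.Indexes;

public class Doc
{
    public string Text { get; set; }
}

public class Docs_ByText : AbstractIndexCreationTask<Doc>
{
    public Docs_ByText()
    {
        Map = docs => from doc in docs
                      select new { doc.Text };

        // Reference the custom analyzer by its assembly qualified name.
        Analyzers.Add(
            d => d.Text,
            "Raven.Extensions.AlphanumericAnalyzer, Raven.Extensions.AlphanumericAnalyzer");
    }
}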

Again, you can grab all of the code and more over on GitHub:

Raven.Extensions.AlphanumericAnalyzer on GitHub


Enjoy,
Tom
