Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CSHARP-5442: Atlas Search doesn't respect the correct serializer #1583

Open
wants to merge 25 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 12 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions src/MongoDB.Driver/FieldValueSerializerHelper.cs
Original file line number Diff line number Diff line change
Expand Up @@ -317,6 +317,8 @@ internal class IEnumerableSerializer<TItem> : SerializerBase<IEnumerable<TItem>>
{
private readonly IBsonSerializer<TItem> _itemSerializer;

public IBsonSerializer<TItem> ItemSerializer => _itemSerializer;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of adding this new public property have the class implement IBsonArraySerializer and implement the TryGetItemSerializationIinfo method:

public bool TryGetItemSerializationInfo(out BsonSerializationInfo serialization
{
    serializationInfo = new(elementName: null, _itemSerializer, typeof(TItem));
    return true;                                                               
}                                                                              

All serializers for array-like values should implement IBsonArraySerializer. I don't know why this class did not already implement it.

public IEnumerableSerializer(IBsonSerializer<TItem> itemSerializer)
{
_itemSerializer = itemSerializer;
Expand Down
1 change: 0 additions & 1 deletion src/MongoDB.Driver/IAggregateFluent.cs
Original file line number Diff line number Diff line change
Expand Up @@ -404,7 +404,6 @@ IAggregateFluent<TNewResult> Lookup<TForeignDocument, TAsElement, TAs, TNewResul
IAggregateFluent<BsonDocument> SetWindowFields<TWindowFields>(
AggregateExpressionDefinition<ISetWindowFieldsPartition<TResult>, TWindowFields> output);

//TODO If I add a parameter here, then this would be a binary breaking change
/// <summary>
/// Appends a $search stage to the pipeline.
/// </summary>
Expand Down
107 changes: 79 additions & 28 deletions src/MongoDB.Driver/Search/OperatorSearchDefinitions.cs
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@
using System.Collections.Generic;
using System.Linq;
using MongoDB.Bson;
using MongoDB.Bson.IO;
using MongoDB.Bson.Serialization;
using MongoDB.Driver.Core.Misc;
using MongoDB.Driver.GeoJsonObjectModel;
Expand Down Expand Up @@ -118,38 +119,88 @@ private protected override BsonDocument RenderArguments(RenderArgs<TDocument> ar

internal sealed class EqualsSearchDefinition<TDocument, TField> : OperatorSearchDefinition<TDocument>
{
private readonly BsonValue _value;
private readonly TField _value;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename TField to TValue because that is what it is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TField is the type of the field, that is also the type of the value, right?
Also if we do it here we'd need to make this change also in the other operators, and probably should propagate it as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TField and TValue are only the same when the field is not an array.

In the case of an array TField is IEnumerable<TValue>.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you remove the code you had earlier for _arrayField?

private readonly FieldDefinition<TDocument, TField> _field;
private readonly FieldDefinition<TDocument, IEnumerable<TField>> _arrayField;

private readonly List<Type> _validTypes =
[
typeof(bool),
typeof(sbyte),
typeof(byte),
typeof(short),
typeof(ushort),
typeof(int),
typeof(uint),
typeof(long),
typeof(float),
typeof(double),
typeof(DateTime),
typeof(DateTimeOffset),
typeof(ObjectId),
typeof(Guid),
typeof(string)
];

public EqualsSearchDefinition(FieldDefinition<TDocument, TField> path, TField value, SearchScoreDefinition<TDocument> score)
: base(OperatorType.Equals, new SingleSearchPathDefinition<TDocument>(path), score)
{
ValidateType();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I question whether we need to have ValidateType at all. Looks to me like this was just a client-side limitation related to what .NET types ToBsonValue knew how to convert to BSON, but now that we are using the proper serializer this limitation should no longer exist.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked with Boris and he confirmed that it is a server limitation which types are supported.

But even so, we probably shouldn't be checking this client side. We should just send the value to the server and let it return an error if it doesn't support the data type.

That way our driver will work well with all supported versions of the server, which might have different limitations. And if future server versions support new value types we won't have to make any changes.

Copy link
Contributor Author

@papafe papafe Jan 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also like this idea. I tried to keep it consistent with what was there before, but I agree that moving this check server side would allow us to be more free in what we do and don't support over time.
I need to remember to do the same with the other operators too.

_value = value;
_field = path;
}

public EqualsSearchDefinition(FieldDefinition<TDocument> path, TField value, SearchScoreDefinition<TDocument> score)
: base(OperatorType.Equals, path, score)
public EqualsSearchDefinition(FieldDefinition<TDocument, IEnumerable<TField>> path, TField value, SearchScoreDefinition<TDocument> score)
: base(OperatorType.Equals, new SingleSearchPathDefinition<TDocument>(path), score)
{
_value = ToBsonValue(value);
ValidateType();
_value = value;
_arrayField = path;
}

private protected override BsonDocument RenderArguments(RenderArgs<TDocument> args) =>
new("value", _value);
private protected override BsonDocument RenderArguments(RenderArgs<TDocument> args)
{
if (_field is null)
{
var renderedField = _arrayField.Render(args);

var serializer =
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incredibly ugly, but was just testing if it worked

Copy link
Contributor Author

@papafe papafe Jan 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rstam this is what I was referring to.
Do you think it would be worth adding a new property to RenderArgs so that we can get directly the serializer we need (similar to RenderForElemMatch for instance)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is on the right track but can use a little tweaking to how the right serializer is found and to reduce duplication between how scalar and array fields are handled.

Something like this:

private protected override BsonDocument RenderArguments(RenderArgs<TDocument> args)
{
    IBsonSerializer<TValue> valueSerializer;
    if (_field != null)
    {
        var renderedField = _field.Render(args);
        valueSerializer = renderedField.FieldSerializer;
    }
    else
    {
        var renderedArrayField = _arrayField.Render(args);
        valueSerializer = (IBsonSerializer<TValue>)ArraySerializerHelper.GetItemSerializer(renderedArrayField.FieldSerializer);
    }                                                                                                                          
                                                                                                                               
    var document = new BsonDocument();                                                                                         
    using var bsonWriter = new BsonDocumentWriter(document);                                                                   
    var context = BsonSerializationContext.CreateRoot(bsonWriter);                                                             
    bsonWriter.WriteStartDocument();                                                                                           
    bsonWriter.WriteName("value");                                                                                             
    valueSerializer.Serialize(context, _value);                                                                                
    bsonWriter.WriteEndDocument();                                                                                             
                                                                                                                               
    return document;                                                                                                           
}                                                                                                                              

Copy link
Contributor Author

@papafe papafe Jan 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect, this solves my doubts. That makes sense, I was mostly focusing on how to get the correct serializer before trying to refactor.

Regarding IBsonArraySerializer, the majority of times when we implement that interface we also implement IChildSerializerConfigurable, and that makes sense. It also seems that for this use case the result we'd get would be the same, so what would you think about using IChildSerializerConfigurable instead?
To be honest this is more of a generic question, I'm curious about if there's any preference on using one over the other, of course considering that TryGetItemSerializationInfo can potentially return more info (not in this case though).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IBsonArraySerializer and IChildSerializerConfigurable are orthogonal. We can't just implement IChildSerializerConfigurable instead of IBsonArraySerializer.

In this case we must implement IBsonArraySerializer.

It is true that in general whenever implementing IBsonArraySerializer we would likely also want to implement IChildSerializerConfigurable. However, that applies mostly to public serializers that a user might want to configure. This is an internal serializer, so I think implementing IBsonArraySerializer is sufficient.

(renderedField.ValueSerializer as FieldValueSerializerHelper.IEnumerableSerializer<TField>)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of trying to cast to the very specific class FieldValueSerializerHelper.IEnumerableSerializer<TField> use the more general from (used everywhere else) to get the item serializer from an array-like serializer:

valueSerializer = (IBsonSerializer<TValue>)ArraySerializerHelper.GetItemSerializer(renderedArrayField.FieldSerializer);

This is why FieldValueSerializerHelper.IEnumerableSerializer<TItem> should implement IBsonArraySerializer (as all array-like serializers should). Not sure why it didn't already...

.ItemSerializer;

var document = new BsonDocument();
using var bsonWriter = new BsonDocumentWriter(document);
var context = BsonSerializationContext.CreateRoot(bsonWriter);
bsonWriter.WriteStartDocument();
bsonWriter.WriteName("value");
serializer.Serialize(context, _value);
bsonWriter.WriteEndDocument();

return document;
}
else
{
var renderedField = _field.Render(args);

var document = new BsonDocument();
using var bsonWriter = new BsonDocumentWriter(document);
var context = BsonSerializationContext.CreateRoot(bsonWriter);
bsonWriter.WriteStartDocument();
bsonWriter.WriteName("value");
renderedField.ValueSerializer.Serialize(context, _value);
bsonWriter.WriteEndDocument();

return document;
}
}

private static BsonValue ToBsonValue(TField value) =>
value switch
private void ValidateType()
{
if (!_validTypes.Contains(typeof(TField)))
{
bool v => (BsonBoolean)v,
sbyte v => (BsonInt32)v,
byte v => (BsonInt32)v,
short v => (BsonInt32)v,
ushort v => (BsonInt32)v,
int v => (BsonInt32)v,
uint v => (BsonInt64)v,
long v => (BsonInt64)v,
float v => (BsonDouble)v,
double v => (BsonDouble)v,
DateTime v => (BsonDateTime)v,
DateTimeOffset v => (BsonDateTime)v.UtcDateTime,
ObjectId v => (BsonObjectId)v,
Guid v => new BsonBinaryData(v, GuidRepresentation.Standard),
string v => (BsonString)v,
null => BsonNull.Value,
_ => throw new InvalidCastException()
};
throw new InvalidCastException();
}
}
}

internal sealed class ExistsSearchDefinition<TDocument> : OperatorSearchDefinition<TDocument>
Expand Down Expand Up @@ -243,7 +294,7 @@ public InSearchDefinition(
private protected override BsonDocument RenderArguments(RenderArgs<TDocument> args) =>
new("value", _values);

private static BsonValue ToBsonValue(TField value) =>
private static BsonValue ToBsonValue(TField value) => //TODO probably we need to change also this
value switch
{
bool v => (BsonBoolean)v,
Expand Down Expand Up @@ -382,7 +433,7 @@ private protected override BsonDocument RenderArguments(RenderArgs<TDocument> ar
{ _range.IsMaxInclusive ? "lte" : "lt", _max, _max != null },
};

private static BsonValue ToBsonValue(TField? value) =>
private static BsonValue ToBsonValue(TField? value) => //TODO Probably we need to change this too
value switch
{
sbyte v => (BsonInt32)v,
Expand Down
4 changes: 2 additions & 2 deletions src/MongoDB.Driver/Search/SearchDefinition.cs
Original file line number Diff line number Diff line change
Expand Up @@ -123,9 +123,9 @@ private protected enum OperatorType
QueryString,
Range,
Regex,
Search,
Search, //TODO This is never used
Span,
Term,
Term, //TODO This is never used
Text,
Wildcard
}
Expand Down
93 changes: 93 additions & 0 deletions tests/MongoDB.Driver.Tests/Jira/CSharp5442Tests.cs
Original file line number Diff line number Diff line change
@@ -0,0 +1,93 @@
/* Copyright 2010-present MongoDB Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

using System;
using FluentAssertions;
using MongoDB.Bson;
using MongoDB.Bson.Serialization;
using MongoDB.Bson.Serialization.Attributes;
using MongoDB.Bson.Serialization.Serializers;
using Moq;
using Xunit;

namespace MongoDB.Driver.Tests.Jira
{
public class CSharp5442Tests
{
[Fact]
public void Search_Operators_use_correct_serializers_when_using_attributes_and_expression_path()
{
var testGuid = Guid.Parse("01020304-0506-0708-090a-0b0c0d0e0f10");
var collection = new Mock<IMongoCollection<TestClass>>().Object;
var searchDefinition = Builders<TestClass>
.Search
.Compound()
.Must(Builders<TestClass>.Search.Equals(t => t.DefaultGuid, testGuid))
.Must(Builders<TestClass>.Search.Equals(t => t.StringGuid, testGuid));

var result = collection.Aggregate().Search(searchDefinition).ToString();

const string expected = """aggregate([{ "$search" : { "compound" : { "must" : [{ "equals" : { "value" : { "$binary" : { "base64" : "AQIDBAUGBwgJCgsMDQ4PEA==", "subType" : "04" } }, "path" : "DefaultGuid" } }, { "equals" : { "value" : "01020304-0506-0708-090a-0b0c0d0e0f10", "path" : "StringGuid" } }] } } }])""";

result.Should().Be(expected);
}

[Fact]
public void Search_Operators_use_correct_serializers_when_using_attributes_and_string_path()
{
var testGuid = Guid.Parse("01020304-0506-0708-090a-0b0c0d0e0f10");
var collection = new Mock<IMongoCollection<TestClass>>().Object;
var searchDefinition = Builders<TestClass>
.Search
.Compound()
.Must(Builders<TestClass>.Search.Equals("DefaultGuid", testGuid))
.Must(Builders<TestClass>.Search.Equals("StringGuid", testGuid));

var result = collection.Aggregate().Search(searchDefinition).ToString();

const string expected = """aggregate([{ "$search" : { "compound" : { "must" : [{ "equals" : { "value" : { "$binary" : { "base64" : "AQIDBAUGBwgJCgsMDQ4PEA==", "subType" : "04" } }, "path" : "DefaultGuid" } }, { "equals" : { "value" : "01020304-0506-0708-090a-0b0c0d0e0f10", "path" : "StringGuid" } }] } } }])""";

result.Should().Be(expected);
}

[Fact(Skip = "This should only be run manually due to the use of BsonSerializer.RegisterSerializer")] //TODO Put back skip afterwards
public void Search_Operators_use_correct_serializers_when_using_serializer_registry()
{
BsonSerializer.RegisterSerializer(new GuidSerializer(BsonType.String));
var testGuid = Guid.Parse("01020304-0506-0708-090a-0b0c0d0e0f10");
var collection = new Mock<IMongoCollection<TestClass>>().Object;
var searchDefinition = Builders<TestClass>
.Search
.Equals(t => t.UndefinedRepresentationGuid, testGuid);

var result = collection.Aggregate().Search(searchDefinition).ToString();

const string expected = """aggregate([{ "$search" : { "equals" : { "value" : "01020304-0506-0708-090a-0b0c0d0e0f10", "path" : "UndefinedRepresentationGuid" } } }])""";

result.Should().Be(expected);
}

public class TestClass
{
[BsonGuidRepresentation(GuidRepresentation.Standard)]
public Guid DefaultGuid { get; set; }

[BsonRepresentation(BsonType.String)]
public Guid StringGuid { get; set; }

public Guid UndefinedRepresentationGuid { get; set; }
}
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -276,7 +276,7 @@ public void Equals_with_array_should_render_supported_type()
}

[Theory]
[MemberData(nameof(EqualsSupportedTypesTestData))]
[MemberData(nameof(EqualsSupportedTypesTestData))] //TODO Need to fix this
public void Equals_should_render_supported_type<T>(
T value,
string valueRendered,
Expand All @@ -290,6 +290,8 @@ public void Equals_should_render_supported_type<T>(
subject.Equals("x", value),
$"{{ equals: {{ path: 'x', value: {valueRendered} }} }}");

//For the Guid we get (correctly): GuidSerializer cannot serialize a Guid when GuidRepresentation is Unspecified.
Copy link
Contributor Author

@papafe papafe Jan 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rstam This test here is now failing because now we're using the default serializer and that doesn't have the GuidRepresentation specified.
Do you have any suggestion on how to fix this? The exception makes sense, and normally we'd probably suggest statically registering a serializer, but this would impact all the other tests here.
Should we ignore the Guid field for this test and make a separable test here that can only be run manually?
This would need to be done also for other operators.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's because on line 292 this testuses the string form of the field name and specified "x" and there is no field called 'x" to pick up the correctly configured GuidSerializer from.

I would simply skip this part of the test for Guid:

// in this test there is no property called "x" to pick up a properly configured GuidSerializer from
if (typeof(T) != typeof(Guid))
{
    AssertRendered(
        subject.Equals("x", value),
        $"{{ equals: {{ path: 'x', value: {valueRendered} }} }}");
    var scoreBuilder = new SearchScoreDefinitionBuilder<BsonDocument>();
                                                                        
    AssertRendered(
        subject.Equals("x", value, scoreBuilder.Constant(1)),
        $"{{ equals: {{ path: 'x', value: {valueRendered}, score: {{ constant: {{ value: 1 }} }} }} }}");
}                                                                                                        

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I would have loved to avoid making special cases, but it seems we need to. I'll make some additional tests that can be run manually.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with just skipping this test for Guids, because:

  1. This test is against IMongoCollection<BsonDocument> and that is probably not used often
  2. Anyone using BsonDocument should either not be using Guid or should take control of the conversion themselves

A manual test would involve registering a properly configured global GuidSerializer, but we already know that works (in part because the strongly typed version uses a properly configured GuidSerializer, though it's true it doesn't get it from the global registry).


var scoreBuilder = new SearchScoreDefinitionBuilder<BsonDocument>();
AssertRendered(
subject.Equals("x", value, scoreBuilder.Constant(1)),
Expand All @@ -313,7 +315,7 @@ public void Equals_should_render_supported_type<T>(
new object[] { (float)1, "1", Exp(p => p.Float), nameof(Person.Float) },
new object[] { (double)1, "1", Exp(p => p.Double), nameof(Person.Double) },
new object[] { DateTime.MinValue, "ISODate(\"0001-01-01T00:00:00Z\")", Exp(p => p.Birthday), "dob" },
new object[] { DateTimeOffset.MaxValue, "ISODate(\"9999-12-31T23:59:59.999Z\")", Exp(p => p.DateTimeOffset), nameof(Person.DateTimeOffset) },
new object[] { DateTimeOffset.MaxValue, """{ "DateTime" : { "$date" : "9999-12-31T23:59:59.999Z" }, "Ticks" : 3155378975999999999, "Offset" : 0 }""", Exp(p => p.DateTimeOffset), nameof(Person.DateTimeOffset) },
new object[] { ObjectId.Empty, "{ $oid: '000000000000000000000000' }", Exp(p => p.Id), "_id" },
new object[] { Guid.Empty, """{ "$binary" : { "base64" : "AAAAAAAAAAAAAAAAAAAAAA==", "subType" : "04" } }""", Exp(p => p.Guid), nameof(Person.Guid) },
new object[] { null, "null", Exp(p => p.Name), nameof(Person.Name) },
Expand Down