-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CSHARP-5442: Atlas Search doesn't respect the correct serializer #1583
base: main
Are you sure you want to change the base?
Conversation
@BorisDog I've added you as a reviewer since it seems you were the one that worked on search, mostly. This is still a work in progress, I've broke some things in the process.
|
|
||
var renderedField = _arrayField.Render(args); | ||
|
||
var serializer = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incredibly ugly, but was just testing if it worked
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rstam this is what I was referring to.
Do you think it would be worth adding a new property to RenderArgs
so that we can get directly the serializer we need (similar to RenderForElemMatch
for instance)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is on the right track but can use a little tweaking to how the right serializer is found and to reduce duplication between how scalar and array fields are handled.
Something like this:
private protected override BsonDocument RenderArguments(RenderArgs<TDocument> args)
{
IBsonSerializer<TValue> valueSerializer;
if (_field != null)
{
var renderedField = _field.Render(args);
valueSerializer = renderedField.FieldSerializer;
}
else
{
var renderedArrayField = _arrayField.Render(args);
valueSerializer = (IBsonSerializer<TValue>)ArraySerializerHelper.GetItemSerializer(renderedArrayField.FieldSerializer);
}
var document = new BsonDocument();
using var bsonWriter = new BsonDocumentWriter(document);
var context = BsonSerializationContext.CreateRoot(bsonWriter);
bsonWriter.WriteStartDocument();
bsonWriter.WriteName("value");
valueSerializer.Serialize(context, _value);
bsonWriter.WriteEndDocument();
return document;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perfect, this solves my doubts. That makes sense, I was mostly focusing on how to get the correct serializer before trying to refactor.
Regarding IBsonArraySerializer
, the majority of times when we implement that interface we also implement IChildSerializerConfigurable
, and that makes sense. It also seems that for this use case the result we'd get would be the same, so what would you think about using IChildSerializerConfigurable
instead?
To be honest this is more of a generic question, I'm curious about if there's any preference on using one over the other, of course considering that TryGetItemSerializationInfo
can potentially return more info (not in this case though).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IBsonArraySerializer
and IChildSerializerConfigurable
are orthogonal. We can't just implement IChildSerializerConfigurable
instead of IBsonArraySerializer
.
In this case we must implement IBsonArraySerializer
.
It is true that in general whenever implementing IBsonArraySerializer
we would likely also want to implement IChildSerializerConfigurable
. However, that applies mostly to public
serializers that a user might want to configure. This is an internal
serializer, so I think implementing IBsonArraySerializer
is sufficient.
@@ -118,38 +119,88 @@ private protected override BsonDocument RenderArguments(RenderArgs<TDocument> ar | |||
|
|||
internal sealed class EqualsSearchDefinition<TDocument, TField> : OperatorSearchDefinition<TDocument> | |||
{ | |||
private readonly BsonValue _value; | |||
private readonly TField _value; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename TField
to TValue
because that is what it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TField
is the type of the field, that is also the type of the value, right?
Also if we do it here we'd need to make this change also in the other operators, and probably should propagate it as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TField
and TValue
are only the same when the field is not an array.
In the case of an array TField
is IEnumerable<TValue>
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you remove the code you had earlier for _arrayField
?
public EqualsSearchDefinition(FieldDefinition<TDocument, TField> path, TField value, SearchScoreDefinition<TDocument> score) | ||
: base(OperatorType.Equals, new SingleSearchPathDefinition<TDocument>(path), score) | ||
{ | ||
ValidateType(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I question whether we need to have ValidateType
at all. Looks to me like this was just a client-side limitation related to what .NET types ToBsonValue
knew how to convert to BSON, but now that we are using the proper serializer this limitation should no longer exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked with Boris and he confirmed that it is a server limitation which types are supported.
But even so, we probably shouldn't be checking this client side. We should just send the value to the server and let it return an error if it doesn't support the data type.
That way our driver will work well with all supported versions of the server, which might have different limitations. And if future server versions support new value types we won't have to make any changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also like this idea. I tried to keep it consistent with what was there before, but I agree that moving this check server side would allow us to be more free in what we do and don't support over time.
I need to remember to do the same with the other operators too.
|
||
var renderedField = _arrayField.Render(args); | ||
|
||
var serializer = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is on the right track but can use a little tweaking to how the right serializer is found and to reduce duplication between how scalar and array fields are handled.
Something like this:
private protected override BsonDocument RenderArguments(RenderArgs<TDocument> args)
{
IBsonSerializer<TValue> valueSerializer;
if (_field != null)
{
var renderedField = _field.Render(args);
valueSerializer = renderedField.FieldSerializer;
}
else
{
var renderedArrayField = _arrayField.Render(args);
valueSerializer = (IBsonSerializer<TValue>)ArraySerializerHelper.GetItemSerializer(renderedArrayField.FieldSerializer);
}
var document = new BsonDocument();
using var bsonWriter = new BsonDocumentWriter(document);
var context = BsonSerializationContext.CreateRoot(bsonWriter);
bsonWriter.WriteStartDocument();
bsonWriter.WriteName("value");
valueSerializer.Serialize(context, _value);
bsonWriter.WriteEndDocument();
return document;
}
@@ -317,6 +317,8 @@ internal class IEnumerableSerializer<TItem> : SerializerBase<IEnumerable<TItem>> | |||
{ | |||
private readonly IBsonSerializer<TItem> _itemSerializer; | |||
|
|||
public IBsonSerializer<TItem> ItemSerializer => _itemSerializer; | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of adding this new public property have the class implement IBsonArraySerializer
and implement the TryGetItemSerializationIinfo
method:
public bool TryGetItemSerializationInfo(out BsonSerializationInfo serialization
{
serializationInfo = new(elementName: null, _itemSerializer, typeof(TItem));
return true;
}
All serializers for array-like values should implement IBsonArraySerializer
. I don't know why this class did not already implement it.
var renderedField = _arrayField.Render(args); | ||
|
||
var serializer = | ||
(renderedField.ValueSerializer as FieldValueSerializerHelper.IEnumerableSerializer<TField>) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of trying to cast to the very specific class FieldValueSerializerHelper.IEnumerableSerializer<TField>
use the more general from (used everywhere else) to get the item serializer from an array-like serializer:
valueSerializer = (IBsonSerializer<TValue>)ArraySerializerHelper.GetItemSerializer(renderedArrayField.FieldSerializer);
This is why FieldValueSerializerHelper.IEnumerableSerializer<TItem>
should implement IBsonArraySerializer
(as all array-like serializers should). Not sure why it didn't already...
public EqualsSearchDefinition(FieldDefinition<TDocument, TField> path, TField value, SearchScoreDefinition<TDocument> score) | ||
: base(OperatorType.Equals, new SingleSearchPathDefinition<TDocument>(path), score) | ||
{ | ||
ValidateType(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked with Boris and he confirmed that it is a server limitation which types are supported.
But even so, we probably shouldn't be checking this client side. We should just send the value to the server and let it return an error if it doesn't support the data type.
That way our driver will work well with all supported versions of the server, which might have different limitations. And if future server versions support new value types we won't have to make any changes.
@@ -290,6 +290,8 @@ public void Equals_should_render_supported_type<T>( | |||
subject.Equals("x", value), | |||
$"{{ equals: {{ path: 'x', value: {valueRendered} }} }}"); | |||
|
|||
//For the Guid we get (correctly): GuidSerializer cannot serialize a Guid when GuidRepresentation is Unspecified. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rstam This test here is now failing because now we're using the default serializer and that doesn't have the GuidRepresentation
specified.
Do you have any suggestion on how to fix this? The exception makes sense, and normally we'd probably suggest statically registering a serializer, but this would impact all the other tests here.
Should we ignore the Guid
field for this test and make a separable test here that can only be run manually?
This would need to be done also for other operators.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's because on line 292 this testuses the string form of the field name and specified "x"
and there is no field called 'x"
to pick up the correctly configured GuidSerializer
from.
I would simply skip this part of the test for Guid
:
// in this test there is no property called "x" to pick up a properly configured GuidSerializer from
if (typeof(T) != typeof(Guid))
{
AssertRendered(
subject.Equals("x", value),
$"{{ equals: {{ path: 'x', value: {valueRendered} }} }}");
var scoreBuilder = new SearchScoreDefinitionBuilder<BsonDocument>();
AssertRendered(
subject.Equals("x", value, scoreBuilder.Constant(1)),
$"{{ equals: {{ path: 'x', value: {valueRendered}, score: {{ constant: {{ value: 1 }} }} }} }}");
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I would have loved to avoid making special cases, but it seems we need to. I'll make some additional tests that can be run manually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with just skipping this test for Guids, because:
- This test is against
IMongoCollection<BsonDocument>
and that is probably not used often - Anyone using
BsonDocument
should either not be usingGuid
or should take control of the conversion themselves
A manual test would involve registering a properly configured global GuidSerializer
, but we already know that works (in part because the strongly typed version uses a properly configured GuidSerializer
, though it's true it doesn't get it from the global registry).
@@ -141,6 +141,7 @@ public static IBsonSerializer GetSerializerForValueType(IBsonSerializer fieldSer | |||
return ConvertIfPossibleSerializer.Create(valueType, fieldType, fieldSerializer, serializerRegistry); | |||
} | |||
|
|||
//TODO Never used |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we remove this?
new object[] { new[] { ObjectId.Empty, ObjectId.Parse("4d0ce088e447ad08b4721a37") }, new[] { "{ $oid: '000000000000000000000000' }", "{ $oid: '4d0ce088e447ad08b4721a37' }" } }, | ||
new object[] { new object[] { (byte)1, (short)2, (int)3 }, new[] { "1", "2", "3" } } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to support this? This does not work anymore because before we were using the serializer depending on the type of the value, while now we're trying to look at the serializer appropriate for the field.
@@ -123,9 +123,9 @@ private protected enum OperatorType | |||
QueryString, | |||
Range, | |||
Regex, | |||
Search, | |||
Search, //TODO This is never used, should we remove them? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like these two can be safely removed, should we do it?
@@ -119,7 +119,7 @@ public SearchDefinition<TDocument> EmbeddedDocument<TField>( | |||
/// <param name="score">The score modifier.</param> | |||
/// <returns>An equality search definition.</returns> | |||
public SearchDefinition<TDocument> Equals<TField>( | |||
FieldDefinition<TDocument, TField> path, | |||
FieldDefinition<TDocument, TField> path, //TODO We should try to uniform this to the In operator for next major (this should be SearchPathDefinition<TDocument>) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can create a ticket for this if we want to go this way
@@ -118,38 +119,88 @@ private protected override BsonDocument RenderArguments(RenderArgs<TDocument> ar | |||
|
|||
internal sealed class EqualsSearchDefinition<TDocument, TField> : OperatorSearchDefinition<TDocument> | |||
{ | |||
private readonly BsonValue _value; | |||
private readonly TField _value; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TField
and TValue
are only the same when the field is not an array.
In the case of an array TField
is IEnumerable<TValue>
.
@@ -118,38 +119,88 @@ private protected override BsonDocument RenderArguments(RenderArgs<TDocument> ar | |||
|
|||
internal sealed class EqualsSearchDefinition<TDocument, TField> : OperatorSearchDefinition<TDocument> | |||
{ | |||
private readonly BsonValue _value; | |||
private readonly TField _value; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you remove the code you had earlier for _arrayField
?
@@ -322,6 +323,12 @@ public IEnumerableSerializer(IBsonSerializer<TItem> itemSerializer) | |||
_itemSerializer = itemSerializer; | |||
} | |||
|
|||
public bool TryGetItemSerializationInfo(out BsonSerializationInfo serializationInfo) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keep methods in alphabetical order?
{ | ||
var searchPathDefinition = (SingleSearchPathDefinition<TDocument>)_path; | ||
var renderedField = searchPathDefinition.Field.Render(args); | ||
var serializer = renderedField.FieldSerializer switch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe rename serializer
to valueSerializer
.
To remove confusion about whether it is the field serializer or the value serializer (which are not the same in the case of array fields).
@@ -118,38 +121,35 @@ private protected override BsonDocument RenderArguments(RenderArgs<TDocument> ar | |||
|
|||
internal sealed class EqualsSearchDefinition<TDocument, TField> : OperatorSearchDefinition<TDocument> | |||
{ | |||
private readonly BsonValue _value; | |||
private readonly TField _value; | |||
|
|||
public EqualsSearchDefinition(FieldDefinition<TDocument> path, TField value, SearchScoreDefinition<TDocument> score) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is unfortunate that path
is of type FieldDefinition<TDocument>
instead of FieldDefinition<TDocument, TField>
.
This is not type safe because it allows mismatches between the field type and the value type that are not caught at compile time.
In an earlier commit you had changed this but now it's removed. Can you explain why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to uniform the code between the Equals
, In
and Range
search definition classes.
Unfortunately for In
and Range
we can't have the same constructor with FieldDefinition<TDocument, Field>
because of the current public API of the operators in SearchDefinitionBuilder
.
I agree that this is not type safe, but at least EqualsSearchDefinition
is an internal class, and the operators in SearchDefinitionBuilder
that use it have type safety (either through expressions or field definitions).
If this was a major version, I'd suggest to modify SingleSearchPathDefinition
(the subclass of SearchPathDefinition
for single fields) from using one generic parameter TDocument
to two TDocument, TField
, so we could be on the safer side.
Probably it would be worth actually removing SearchPathDefinition<TDocument>
as an input parameter of the operators, and create multiple overrides for the various subclasses of SearchPathDefinition
, so that we can use as many generic parameters as necessary.
I think this is something that would be worth investigating for next major in order to improve the API, but I don't think we can do much at the moment without making breaking changes.
(We also have MultiSearchPathDefinition
, and that's another thing to consider when investigating what we can do with the API)
AssertRendered( | ||
subjectTyped.In(p => p.Hobbies, ["dance", "ski"]), | ||
"{ in: { path: 'hobbies', value: ['dance', 'ski'] } }"); | ||
} | ||
|
||
[Fact] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the mongo docs In
should support wildcard paths, but we had no tests for that.
JIRA ticket