Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RegExpExecArray does not match ArraySchema #946

Open
Kritolan opened this issue Nov 30, 2024 · 10 comments
Open

RegExpExecArray does not match ArraySchema #946

Kritolan opened this issue Nov 30, 2024 · 10 comments
Assignees
Labels
question Further information is requested

Comments

@Kritolan
Copy link

Kritolan commented Nov 30, 2024

I just want to parse a specific value from a string using RegExp. This works, but TS complains about types.

import * as v from 'valibot';

const Schema = v.pipe(
  v.string(),
  // correct string - 2024.01.30
  v.transform(value => /^(\d{4})\.\d{2}\.\d{2}$/.exec(value)),
  v.array(v.string()),
  v.length(2),
  v.transform(value => Number(value[1])),
);

const result = v.safeParse(Schema, '2024.01.30');
console.log(result);
@EltonLobo07
Copy link
Contributor

EltonLobo07 commented Nov 30, 2024

This is because of the use of v.array(v.string()) after the transformation. A pipeline schema will try to narrow the base type to a specific type (a type that can be assigned to the base type). Some examples:

const GoodSchema = v.pipe(
  v.union([v.number(), v.string()]), 
  // `number` (specific type) is assignable to `number` | `string` (base type)
  v.number()
);

// The type checker won't accept this schema
const BadSchema = v.pipe(
  v.union([v.number(), v.string()]),
  // `null` is not assignable to `number` | `string`
  v.null(),
);

In the example associated with this issue, Array<string> is not assignable to RegExpExecArray | null. This example might help understand the issue better:

// The type checker generates a type error because this assignment is not safe
const test: RegExpExecArray | null = new Array<string>();

Based on how the pipeline is set up in the example associated with this issue, I recommend using this schema:

const Schema = v.pipe(
  v.string(),
  // correct string - 2024.01.30
  v.transform(value => /^(\d{4})\.\d{2}\.\d{2}$/.exec(value) ?? new Array<string>()),
  v.length(2),
  v.transform(value => Number(value[1])),
);

@Kritolan
Copy link
Author

Kritolan commented Dec 2, 2024

Thanks, this helped. However, this doesn't seem very intuitive.

Initially, I expected the schema check to always accept unknown as input, and then check the type and throw an error. This looks logical and quite convenient because my example looks clearer than yours. Is it possible to bring existing behavior to this?

@EltonLobo07
Copy link
Contributor

I think the current pipeline behavior is important because it helps avoid creating impossible schemas. Example:

// Type error because: A value cannot be associated with a `string` and `number` type at the same time (impossible)
const Schema = v.pipe(v.string(), v.number());

But if you are sure you won't create a schema that always generates a runtime error (like in the example associated with this issue), you can return unknown from the transform action.

const Schema = v.pipe(
  v.string(),
  // This `transform` action returns `unknown` now
  v.transform((value): unknown => /^(\d{4})\.\d{2}\.\d{2}$/.exec(value)),
  v.array(v.string()),
  v.length(2),
  v.transform(value => Number(value[1])),
);

Will this help?

@Kritolan
Copy link
Author

Kritolan commented Dec 2, 2024

I like this solution better and will use it for now.

What about the current behavior - is it really such a problem? After all, this also doesn't allow using completely valid schemes:

const Schema = v.pipe(
  v.looseObject({test1: v.string()}),
  v.object({test2: v.string()}),
);

You can make this behavior configurable. Or check for such incompatibility in runtime and throw issue.

@EltonLobo07
Copy link
Contributor

EltonLobo07 commented Dec 3, 2024

I think there is always a way to create schemas with the current pipe behavior. For example, the mentioned example can be restructured to:

const Schema = v.object({test1: v.string(), test2: v.string()});

Making the behavior configurable is possible, but I am not sure if that would be ideal. @fabian-hiller Can you share your thoughts?

Also, I think checking for incompatibility at runtime and throwing an issue will:

  1. Be expensive and impossible (maybe)
    Users have the freedom to create very complex types, which make incompatibility checks complex or nearly impossible.
  2. Defeat the purpose of having a type checker
    The purpose of a type checker is to catch a lot of potential bugs at compile time. Allowing users to create invalid schemas and then throwing an issue at runtime is the same as allowing users to call one of string methods on a number value (Example: (12).toUpperCase()). Basically, we are back to JavaScript.

Fabian might be busy for a few days. So let's keep this issue open till we get a response.

@fabian-hiller
Copy link
Owner

I expected the schema check to always accept unknown as input, and then check the type and throw an error.

This is only necessary for the initial unknown input, but in your case the return type if transform is known. There is basically no need to use a schema function for validation. Your transformation should not return an invalid data type. I recommend using something like regex before transform to make sure that the transformation always returns a valid output.

I would rewrite your schema:

import * as v from 'valibot';

const DATE_REGEX = /^\d{4}\.\d{2}\.\d{2}$/u;

const Schema = v.pipe(
  v.string(),
  v.regex(DATE_REGEX),
  v.transform((input) => DATE_REGEX.exec(input)![1])
);

@fabian-hiller fabian-hiller added the question Further information is requested label Dec 4, 2024
@ajotaos
Copy link

ajotaos commented Dec 5, 2024

What about an action that validates a a regex but also returns a regex or its captures? I have something alike by using rawTranform to avoid evaluating the regex twice.

@fabian-hiller
Copy link
Owner

If you or anyone else has a good idea for naming and functionality, I am happy to consider such an action. Note that you can always write your own action.

@ajotaos
Copy link

ajotaos commented Dec 16, 2024

This is a very opinionated version because of the type of Regex we're using which is named captures, but something alike maybe:

import * as v from 'valibot';

export function captureRegex(
	regex: RegExp,
	message?: v.ErrorMessage<v.RawTransformIssue<string>> | undefined,
): v.RawTransformAction<string, Record<string, string>> {
	return v.rawTransform(({ dataset, addIssue, NEVER }) => {
		const match = regex.exec(dataset.value);

		if (!match) {
			addIssue({
				expected: String(regex),
				message:
					message ??
					((issue) =>
						`Invalid format: Expected ${String(regex)} but received "${issue.input}"`),
			});

			return NEVER;
		}

		return match.groups ?? {};
	});
}

@fabian-hiller
Copy link
Owner

Thank you for sharing! I will leave this issue open to see if more people request such an action. Feel free to add a 👍 to this comment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

4 participants