<locale>: std::collate_byname<_Elem>::hash() yields different hashes for strings that collate the same #5212

Open
muellerj2 opened this issue Dec 29, 2024 · 0 comments
Labels
bug Something isn't working

Comments

@muellerj2
Contributor

[locale.collate.virtuals]/3 specifies that collate<_Elem>::do_hash() returns the same hash for all strings that collate the same. However, collate_byname<_Elem>::(do_)hash() does not produce such hashes for non-C locales.

Test case

#include <iostream>
#include <iterator> // std::size
#include <locale>

using namespace std;

int main() {
	const locale loc("de_DE");
	auto& coll = use_facet<collate<wchar_t>>(loc);
	const wchar_t ex1[] = L"Straße";
	const wchar_t ex2[] = L"Strasse";

	// size(exN) counts the terminating null character, hence the "- 1".
	cout << "collate the same: " << (coll.compare(ex1, ex1 + size(ex1) - 1, ex2, ex2 + size(ex2) - 1) == 0) << '\n';
	cout << "hash the same: " << (coll.hash(ex1, ex1 + size(ex1) - 1) == coll.hash(ex2, ex2 + size(ex2) - 1)) << '\n';
	return 0;
}

prints

collate the same: 1
hash the same: 0

Godbolt link

Expected result

This should print

collate the same: 1
hash the same: 1

Additional remarks

For non-C locales, I think the hash function should essentially do:

return hash(transform(_First, _Last));

Alternatively, LCMapStringA/W/Ex with LCMAP_HASH could be used. It's probably faster, but according to the API documentation, LCMAP_HASH is not guaranteed to produce the same hash for all strings that collate the same, so it seems this also wouldn't fully conform to [locale.collate.virtuals]/3.
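
To illustrate the transform()-based idea from outside the facet, here is a minimal sketch. The helper name hash_via_sort_key is hypothetical and not part of the STL, and using std::hash on the sort key is an assumption made for illustration; an actual fix inside collate_byname might hash the key differently. The sketch relies only on the guarantee that two strings compare equal under compare() exactly when their transform() results are equal.

```cpp
#include <cstddef>
#include <functional>
#include <locale>
#include <string>

// Hypothetical helper (not part of the STL): hashes the locale-specific
// sort key rather than the raw characters of [first, last).
template <class Elem>
std::size_t hash_via_sort_key(const std::collate<Elem>& coll, const Elem* first, const Elem* last) {
    // transform() yields a key whose lexicographic comparison matches
    // coll.compare(), so strings that collate the same get equal keys.
    const std::basic_string<Elem> key = coll.transform(first, last);

    // Assumption: any deterministic hash of the key works here; std::hash is
    // used only to keep the sketch short.
    return std::hash<std::basic_string<Elem>>{}(key);
}
```

With the facet from the test case, hash_via_sort_key(coll, ex1, ex1 + size(ex1) - 1) and the corresponding call for ex2 should agree whenever compare() returns 0. The cost is one sort-key allocation per hash call.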

@CaseyCarter CaseyCarter added the bug Something isn't working label Dec 29, 2024