[OT] "Shrinking" IDs in SQL Server

Hello I was hoping some of the SQL experts here could help with a problem I am facing. Currently have covid too so perhaps I'm not thinking clearly... Long story short, is there a way in SQL Server to convert strings (varchars) of any length to exactly 32, keeping uniqueness? It would also be nice if I could convert them back to ensure I haven't lost information. Briefly, the reason I want to do this is my target is varchar(32) but the IDs I am dealing with are made up of composite business keys of larger length. I need to be able to use the current business keys and generate unique 32 character equivalents in a deterministic way. Appreciate any advice. Thanks Tom

Hi Tom, So many questions about what you’re trying to do (sounds really odd), but I think the safest for you right now would be to just select the current key and the truncated value out into another table while you take a look at them. Regards, Greg Dr Greg Low 1300SQLSQL (1300 775 775) office | +61 419201410 mobile SQL Down Under | Web: https://sqldownunder.com<https://sqldownunder.com/> | About Greg: https://about.me/greg.low From: Tom P via ozdotnet <ozdotnet@ozdotnet.com> Sent: Tuesday, 14 March 2023 9:52 AM To: ozDotNet <ozdotnet@ozdotnet.com> Cc: Tom P <tompbiz93@gmail.com> Subject: [OT] "Shrinking" IDs in SQL Server Hello I was hoping some of the SQL experts here could help with a problem I am facing. Currently have covid too so perhaps I'm not thinking clearly... Long story short, is there a way in SQL Server to convert strings (varchars) of any length to exactly 32, keeping uniqueness? It would also be nice if I could convert them back to ensure I haven't lost information. Briefly, the reason I want to do this is my target is varchar(32) but the IDs I am dealing with are made up of composite business keys of larger length. I need to be able to use the current business keys and generate unique 32 character equivalents in a deterministic way. Appreciate any advice. Thanks Tom

To "attempt" to guarantee uniqueness you could try hashing, but it's one way only and you might get clashes. On Tue, 14 Mar 2023, 9:55 am Tom P via ozdotnet, <ozdotnet@ozdotnet.com> wrote:
Hello
I was hoping some of the SQL experts here could help with a problem I am facing. Currently have covid too so perhaps I'm not thinking clearly...
Long story short, is there a way in SQL Server to convert strings (varchars) of any length to exactly 32, keeping uniqueness? It would also be nice if I could convert them back to ensure I haven't lost information.
Briefly, the reason I want to do this is my target is varchar(32) but the IDs I am dealing with are made up of composite business keys of larger length. I need to be able to use the current business keys and generate unique 32 character equivalents in a deterministic way.
Appreciate any advice.
Thanks Tom -- ozdotnet mailing list To manage your subscription, access archives: https://codify.mailman3.com/

Hi Tom, I think this is a maths problem more than a SQL one. Good fun! At first I thought "just MD5 hash the original string to 16 bytes and store the 32 hex chars". Then you said you'd like to convert back, which sank my idea. The only option left is compression, but I'll bet it would be a miracle if all of your original strings could be roundtripped via 32 compressed characters. *Greg K*

The converting back was just a nice to have so I could validate things like a unit test. I'll look into hashing though. Thanks Tom On Tue, 14 Mar 2023 at 10:17, Greg Keogh via ozdotnet <ozdotnet@ozdotnet.com> wrote:
Hi Tom, I think this is a maths problem more than a SQL one. Good fun!
At first I thought "just MD5 hash the original string to 16 bytes and store the 32 hex chars".
Then you said you'd like to convert back, which sank my idea. The only option left is compression, but I'll bet it would be a miracle if all of your original strings could be roundtripped via 32 compressed characters.
*Greg K* -- ozdotnet mailing list To manage your subscription, access archives: https://codify.mailman3.com/

Possibly both. Use the hash for your original table, and setup a secondary table with hash as the key, and a zipped file as data. But you'd need to have some provision for the inevitable collisions. On Tue, 14 Mar 2023, 09:47 Greg Keogh via ozdotnet, <ozdotnet@ozdotnet.com> wrote:
Hi Tom, I think this is a maths problem more than a SQL one. Good fun!
At first I thought "just MD5 hash the original string to 16 bytes and store the 32 hex chars".
Then you said you'd like to convert back, which sank my idea. The only option left is compression, but I'll bet it would be a miracle if all of your original strings could be roundtripped via 32 compressed characters.
*Greg K* -- ozdotnet mailing list To manage your subscription, access archives: https://codify.mailman3.com/

Possibly both. Use the hash for your original table, and setup a secondary table with hash as the key, and a zipped file as data. But you'd need to have some provision for the inevitable collisions.
A 16 byte (128 bit) hash should be practically safe from collisions. See https://en.wikipedia.org/wiki/Birthday_attack (table about half way down) -- *GK*
participants (5)
-
Dr Greg Low
-
Greg Keogh
-
mike smith
-
Tom P
-
Tony Wright