Astro - Hacker News

14 comments

jandrewrogers 18 minutes ago

What counts as the “simplest” hash function depends entirely on what you are using it for and the guarantees you want it to make. An optimal permutation of an integer is different from a small string hash, which is different again from a block checksum. Literally, you are optimizing the algorithm for entirely different properties. No algorithm can satisfy all of them even approximately.
The full scope of things hash functions are commonly used for requires at least four algorithms if you care about performance and optimality. It is disconcertingly common to see developers using hash algorithms in contexts where they are not fit for purpose. Gotta pick the right tool for the job.
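To make the three use cases named above concrete, here is a rough sketch with one function per job. The specific picks (the MurmurHash3 64-bit finalizer, 32-bit FNV-1a, and CRC-32 from `zlib`) are illustrative assumptions, not recommendations from this comment, and the function names are made up for the example.

```python
import zlib

MASK64 = (1 << 64) - 1


def mix64(x: int) -> int:
    """Integer permutation: the MurmurHash3 fmix64 finalizer (bijective on 64 bits)."""
    x &= MASK64
    x ^= x >> 33
    x = (x * 0xFF51AFD7ED558CCD) & MASK64
    x ^= x >> 33
    x = (x * 0xC4CEB9FE1A85EC53) & MASK64
    x ^= x >> 33
    return x


def fnv1a_32(data: bytes) -> int:
    """Small string hash: 32-bit FNV-1a, one XOR and one multiply per byte."""
    h = 0x811C9DC5  # FNV offset basis
    for byte in data:
        h ^= byte
        h = (h * 0x01000193) & 0xFFFFFFFF  # FNV prime, truncated to 32 bits
    return h


def block_checksum(block: bytes) -> int:
    """Block checksum: CRC-32, designed to detect corruption, not to spread keys."""
    return zlib.crc32(block)
```

Each one is tuned for a different property: `mix64` never collides on 64-bit inputs, `fnv1a_32` is cheap on short keys, and `block_checksum` targets error detection.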
Charon77 an hour ago

> Like addition
I'm perplexed by the claim that addition is cheaper than XOR, especially since addition is built upon XOR. Am I missing anything? Is it JavaScript-specific?
- pjscott 12 minutes ago
  
  The wording was a bit unclear. The previous paragraph mentions wanting something cheaper than "those pesky XORs and multiplications". The multiplication is the expensive part; the (very cheap) XORs are just mildly annoying because you have to think about what they're doing.
- 4k0hz 12 minutes ago
  
  At least on x86, multiple additions and multiplications can be done with a single `lea` instruction so it's preferable to XOR. Though I have no idea about other architectures, compiler implementations, any interpreters...
  - ygra 3 minutes ago
    
    That only helps with multiplications by statically known word sizes (4x, 8x, etc.) and not arbitrary x·y. It can help with many smaller constant multipliers if the compiler is clever, but the multiplier has to be known at compile time.
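    A small sketch of the idea, assuming the usual x86 addressing form of base + index*scale with scale in {1, 2, 4, 8}: a constant multiplier such as 5 or 9 can be rewritten as one shift plus one add, which is the shape a single `lea` can compute. The function names are hypothetical.

```python
def times5(x: int) -> int:
    # x + x*4, the same shape as `lea rax, [rdi + rdi*4]`
    return x + (x << 2)


def times9(x: int) -> int:
    # x + x*8, the same shape as `lea rax, [rdi + rdi*8]`
    return x + (x << 3)
```

    The rewrite only works because 5 and 9 are fixed in advance; for an arbitrary x·y there is no constant to decompose.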
johanvts 37 minutes ago

This is an excellent article for anyone looking for some more in-depth analysis of tabulation-based hashing methods: https://arxiv.org/abs/1505.01523
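For readers who have not seen the technique, a minimal sketch of simple tabulation hashing follows. It assumes 32-bit keys split into four bytes and uses Python's `random` module to fill the tables; the name `make_tabulation_hash` and the seed parameter are made up for illustration and are not taken from the article.

```python
import random


def make_tabulation_hash(seed: int = 0):
    """Build a simple tabulation hash for 32-bit integer keys."""
    rng = random.Random(seed)
    # One table of 256 random 32-bit values for each of the 4 key bytes.
    tables = [[rng.getrandbits(32) for _ in range(256)] for _ in range(4)]

    def hash32(key: int) -> int:
        h = 0
        for i in range(4):
            byte = (key >> (8 * i)) & 0xFF  # i-th byte of the key
            h ^= tables[i][byte]            # XOR in that byte's table entry
        return h

    return hash32
```

Hashing is just four table lookups and XORs, with no multiplications on the hot path.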
bryanrasmussen 2 hours ago

A hash function that produces hashes that are already in the hash table should, IMO, not be called a hash function.
I get why technically it is a hash function, but still, no.
- baruch an hour ago
  
  It is mathematically impossible for a proper hash function (one with an output range smaller than its input range) to not have collisions. The proof uses the pigeonhole principle: https://en.wikipedia.org/wiki/Pigeonhole_principle
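  A quick way to see the argument in action, as a sketch: take a hypothetical 8-bit hash (here the first byte of SHA-256, chosen only for illustration) and feed it 257 distinct inputs. There are only 256 possible outputs, so at least two inputs must share one.

```python
import hashlib


def hash8(data: bytes) -> int:
    """Hypothetical 8-bit hash: the first byte of the SHA-256 digest."""
    return hashlib.sha256(data).digest()[0]


def first_collision(inputs):
    """Return the first pair of distinct inputs with the same hash8 value."""
    seen = {}
    for item in inputs:
        h = hash8(item)
        if h in seen:
            return seen[h], item
        seen[h] = item
    return None


# 257 distinct inputs, 256 possible outputs: a collision is guaranteed.
inputs = [str(i).encode() for i in range(257)]
pair = first_collision(inputs)
```

  No choice of `hash8` avoids this; a better function only changes which inputs collide.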
  - bryanrasmussen 5 minutes ago
    
    ok damn, I did not know this, obviously. Thanks.
- ahazred8ta 2 hours ago
  
  A perfect hash function https://en.wikipedia.org/wiki/Perfect_hash_function has to be specially constructed for each desired set of inputs. Generic hash functions cannot be 'perfect'.
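  To illustrate "specially constructed for each desired set of inputs", here is a brute-force sketch: try seeds until a salted hash maps a fixed key set to distinct slots. The salted SHA-256 scheme, the key list, and the names `build_perfect_hash` and `_slot` are assumptions for the example; real perfect-hash generators use far more efficient constructions.

```python
import hashlib


def _slot(key: str, seed: int, n: int) -> int:
    """Map key to a slot in range(n) using a seed-salted SHA-256."""
    digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % n


def build_perfect_hash(keys, max_seed=10_000):
    """Search for a seed that sends every key to its own slot in range(len(keys))."""
    n = len(keys)
    for seed in range(max_seed):
        if len({_slot(k, seed, n) for k in keys}) == n:
            # Collision-free on this key set only; other keys may still collide.
            return lambda key, seed=seed: _slot(key, seed, n)
    raise ValueError("no collision-free seed found")


keys = ["apple", "banana", "cherry", "date", "elderberry"]
perfect = build_perfect_hash(keys)
slots = [perfect(k) for k in keys]
```

  The result is perfect only for `keys`: add a sixth string and the search has to be rerun.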
- phinnaeus 2 hours ago
  Here is a hash function that does not have hash collisions:
```
  fn hash(data):
    return data
```
  - oliver236 10 minutes ago
    
    what programming language is this?
  - Charon77 an hour ago
    
    Well, it no longer constrains the data to a fixed output length.
    
    - dbdr 36 minutes ago
      
      Sure, but if you constrain to a fixed output length, you will definitely have collisions (pigeonhole principle). There's no way around that.