A technique for compressing AI models by reducing the numerical precision of their parameters (for example, from 32-bit floating point to 8-bit integers), which shrinks model size and speeds up inference; commonly used in embedded and edge applications.
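The idea can be sketched with a minimal affine quantization scheme: floats are mapped to 8-bit integer codes via a scale and zero point, and approximately recovered on dequantization. The helper names below are hypothetical for illustration; real frameworks (e.g. PyTorch or TensorFlow Lite) provide their own quantization APIs.

```python
def quantize(values, num_bits=8):
    """Map a list of floats to unsigned num_bits integer codes (affine scheme)."""
    lo, hi = min(values), max(values)
    qmax = 2 ** num_bits - 1
    scale = (hi - lo) / qmax or 1.0  # guard against all-equal inputs
    zero_point = round(-lo / scale)
    return [round(v / scale) + zero_point for v in values], scale, zero_point

def dequantize(q_values, scale, zero_point):
    """Recover approximate floats from the integer codes."""
    return [(q - zero_point) * scale for q in q_values]

weights = [-1.2, 0.0, 0.5, 2.3]
q, scale, zp = quantize(weights)
recovered = dequantize(q, scale, zp)

# Every code fits in 8 bits, and each recovered value is within
# one quantization step of the original.
assert all(0 <= code <= 255 for code in q)
assert all(abs(a - b) <= scale for a, b in zip(weights, recovered))
```

Storing each parameter as one byte instead of four is what yields the roughly 4x size reduction; the cost is the small reconstruction error bounded by the quantization step.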