Thursday, October 10, 2024

PyTorch - get the total number of model parameters

Total number of model parameters


1. simple version

pytorch_total_params = sum(p.numel() for p in model.parameters())
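The one-liner counts every parameter, trainable or not; a common companion restricts the sum to trainable ones. A quick sketch on a hypothetical toy model (torch assumed installed; `nn.Linear(10, 5)` is just for illustration):

```python
import torch.nn as nn

# hypothetical toy model, only for illustration
model = nn.Linear(10, 5)

total_params = sum(p.numel() for p in model.parameters())
trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)

# nn.Linear(10, 5) has a (5, 10) weight and a (5,) bias: 50 + 5 = 55
print(total_params, trainable_params)
```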

2. listed version

def count_parameters(model):
    str_name = "name"
    str_parameter = "parameter"
    print(f"{str_name:50s}: {str_parameter:>10s}")
    total_params = 0
    for name, parameter in model.named_parameters():
        # skip frozen parameters
        if not parameter.requires_grad:
            continue
        params = parameter.numel()
        print(f"{name:50s}: {params:10d}")  # numel() returns an int, so :d, not :s
        total_params += params
    print(f"Total Trainable Params: {total_params}")
    return total_params

Sunday, July 14, 2024

How to convert an array to a vector in C++, the fast way

array to vector

#include <vector>

constexpr int vec_size = 5;

float a[vec_size] = {0, 1, 2, 3, 4};

std::vector<float> vec_a(a, a + vec_size); // good


  1. https://sites.google.com/site/hashemian/home/tips-and-tricks/copy-array-cpp
  2. https://stackoverflow.com/questions/8777603/what-is-the-simplest-way-to-convert-array-to-vector
  3. https://www.freecodecamp.org/news/cpp-vector-how-to-initialize-a-vector-in-a-constructor/ <- how to initialize a vector from an array in C++

And vector to array

#include <vector>

constexpr int vec_size = 5;

float a[vec_size] = {0, 1, 2, 3, 4};

std::vector<float> vec_a(a, a + vec_size); // good


#include <algorithm>

float b[vec_size] = {};

std::copy(vec_a.begin(), vec_a.end(), b); // good

  1. https://stackoverflow.com/questions/2923272/how-to-convert-vector-to-array
  2. https://iq.opengenus.org/convert-vector-to-array-in-cpp/

Thursday, July 4, 2024

Python float to hexadecimal & hexadecimal to float, and default float vs. fp32

import struct

def float_to_hex(f):

    return hex(struct.unpack('<I', struct.pack('<f', f))[0])


def hex_to_float(h):

    return struct.unpack('!f', bytes.fromhex(h))[0]


hex_val = "0xbf557ca4"

float_val = hex_to_float(hex_val.replace("0x", ""))

print(f"-0.8339 -> 0xbf557ca4 -> {float_val} <- -0.8339331150054932")

----

-0.8339 -> 0xbf557ca4 -> -0.8339331150054932 <- -0.8339331150054932
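Because float_to_hex packs through a 4-byte float, a round trip lands on the nearest fp32 value rather than the original fp64 literal. A self-contained sketch (repeating the two helpers above):

```python
import struct

def float_to_hex(f):
    # pack as little-endian fp32, reinterpret the same bytes as a 32-bit int
    return hex(struct.unpack('<I', struct.pack('<f', f))[0])

def hex_to_float(h):
    # interpret the hex digits as big-endian fp32 bytes
    return struct.unpack('!f', bytes.fromhex(h))[0]

h = float_to_hex(-0.8339)
f = hex_to_float(h.replace("0x", ""))
print(h, f)  # f is the fp32 rounding of -0.8339, not -0.8339 itself
```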


import numpy as np

fp32_value = np.float32(-0.8339)

print(f"-0.8339 -> fp32: {fp32_value}")

----

output: -0.8339 -> fp32: -0.833899974822998


fp64_value = -0.8339

print(f"-0.8339 -> fp64: {fp64_value}")

----

output: -0.8339 -> fp64: -0.8339
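The same fp32-vs-fp64 gap can be seen without numpy, using struct alone; Python's default float is an IEEE-754 double:

```python
import struct
import sys

# Python's float is a C double: 8 bytes, 53-bit mantissa
print(struct.calcsize('d'), sys.float_info.mant_dig)

# packing through 4 bytes ('f') rounds to the nearest fp32,
# reproducing the np.float32 result above
fp32_val = struct.unpack('<f', struct.pack('<f', -0.8339))[0]
print(fp32_val)
```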

Sunday, June 23, 2024

About LoRA

LoRA (Low-Rank Adaptation)

- Paper

[Current]
  • LoRA: Low-Rank Adaptation of Large Language Models, 2021


[Before]

  • Measuring the Intrinsic Dimension of Objective Landscapes, 2018
  • Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning, 2020 
[After]
  • Towards A Unified View of Parameter-Efficient Transfer Learning, 2022
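The core idea shared by these papers: keep the pretrained weight W frozen and learn a low-rank update ΔW = BA, scaled by alpha/r. A minimal numpy sketch (all names and sizes are illustrative, not taken from any specific implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 8, 16, 2, 4        # illustrative sizes; rank r << min(d_out, d_in)

W = rng.standard_normal((d_out, d_in))     # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable down-projection, small init
B = np.zeros((d_out, r))                   # trainable up-projection, zero init -> update starts at 0

def lora_forward(x):
    # y = W x + (alpha / r) * B A x; only A and B are updated during fine-tuning
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
y = lora_forward(x)
# with B initialized to zero, the adapted layer matches the frozen layer exactly
print(np.allclose(y, W @ x))
```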

---- 


- YouTube

https://www.youtube.com/watch?v=dA-NhCtrrVE

https://www.youtube.com/watch?v=BJqwmDpa0wM


https://www.youtube.com/watch?v=t509sv5MT0w

----

The case for 4-bit precision: k-bit Inference Scaling Laws

Parameter-Efficient Transfer Learning for NLP

Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning

QLoRA: Efficient Finetuning of Quantized LLMs


https://www.youtube.com/watch?v=X4VvO3G6_vw

----

Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning




Friday, April 26, 2024