Construction Accident Prevention and Response Generation: Hansol Deco Season 3 Generative AI Competition

Algorithm | NLP | Generative AI | LLM | MLOps | Similarity

 

Question: how to resolve a ValueError while running the baseline

2025.02.20 09:29 2,086 Views

The baseline code produces the error below.


I have already updated transformers, accelerate, and bitsandbytes to their latest versions,


and the same error occurs not only on Python 3.12 but also on 3.11 and 3.8.


I would appreciate any advice from the experts on how to resolve this.


Thank you.


import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

# model_id = "ybelkada/falcon-7b-sharded-bf16"
model_id = "NCSOFT/Llama-VARCO-8B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map='auto',
    torch_dtype=torch.bfloat16,
    trust_remote_code=True)


---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[7], line 6
      3 # model_id = 'Bllossom/llama-3.2-Korean-Bllossom-3B'
      5 tokenizer = AutoTokenizer.from_pretrained(model_id)
----> 6 model = AutoModelForCausalLM.from_pretrained(model_id, 
      7                                              quantization_config=bnb_config, 
      8                                              device_map='auto',
      9                                             torch_dtype=torch.bfloat16,
     10                                             trust_remote_code=True)
     12 model.to('cpu')
     13 if torch.cuda.is_available():

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\transformers\models\auto\auto_factory.py:563, in _BaseAutoModelClass.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    561 elif type(config) in cls._model_mapping.keys():
    562     model_class = _get_model_class(config, cls._model_mapping)
--> 563     return model_class.from_pretrained(
    564         pretrained_model_name_or_path, *model_args, config=config, **hub_kwargs, **kwargs
    565     )
    566 raise ValueError(
    567     f"Unrecognized configuration class {config.__class__} for this kind of AutoModel: {cls.__name__}.\n"
    568     f"Model type should be one of {', '.join(c.__name__ for c in cls._model_mapping.keys())}."
    569 )

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\transformers\modeling_utils.py:3820, in PreTrainedModel.from_pretrained(cls, pretrained_model_name_or_path, config, cache_dir, ignore_mismatched_sizes, force_download, local_files_only, token, revision, use_safetensors, *model_args, **kwargs)
   3818         device_map_kwargs["force_hooks"] = True
   3819     if not is_fsdp_enabled() and not is_deepspeed_zero3_enabled():
-> 3820         dispatch_model(model, **device_map_kwargs)
   3822 if hf_quantizer is not None:
   3823     hf_quantizer.postprocess_model(model)

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\accelerate\big_modeling.py:496, in dispatch_model(model, device_map, main_device, state_dict, offload_dir, offload_index, offload_buffers, skip_keys, preload_module_classes, force_hooks)
    494     device = f"xpu:{device}"
    495 if device != "disk":
--> 496     model.to(device)
    497 else:
    498     raise ValueError(
    499         "You are trying to offload the whole model to the disk. Please use the `disk_offload` function instead."
    500     )

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\transformers\modeling_utils.py:2702, in PreTrainedModel.to(self, *args, **kwargs)
   2700 # Checks if the model has been loaded in 8-bit
   2701 if getattr(self, "quantization_method", None) == QuantizationMethod.BITS_AND_BYTES:
-> 2702     raise ValueError(
   2703         "`.to` is not supported for `4-bit` or `8-bit` bitsandbytes models. Please use the model as it is, since the"
   2704         " model has already been set to the correct devices and casted to the correct `dtype`."
   2705     )
   2706 elif getattr(self, "quantization_method", None) == QuantizationMethod.GPTQ:
   2707     # For GPTQ models, we prevent users from casting the model to another dytpe to restrict unwanted behaviours.
   2708     # the correct API should be to load the model with the desired dtype directly through `from_pretrained`.
   2709     dtype_present_in_args = False

ValueError: `.to` is not supported for `4-bit` or `8-bit` bitsandbytes models. Please use the model as it is, since the model has already been set to the correct devices and casted to the correct `dtype`.
IIllIIllIIll
2025.02.20 11:09

Did you by any chance add model.to('cpu') to the code? The error seems to come from the .to method.
device_map='auto' already places the model on the appropriate devices, so calling model.to('cpu') on top of that appears to cause the problem.
A model quantized with bitsandbytes cannot use the .to method (see the error message: `.to` is not supported for `4-bit` or `8-bit` bitsandbytes models.)
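For reference, a minimal sketch of the pattern the error message asks for, reusing model_id, bnb_config, and tokenizer from the question (the prompt string is only a placeholder): load the quantized model once with device_map and never move it with .to(); only the input tensors get moved.

# device_map="auto" already places the 4-bit weights; never call model.to(...) on the result
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# input tensors can still be moved normally
inputs = tokenizer("placeholder prompt", return_tensors="pt").to(model.device)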

용용죽겠지
2025.02.20 13:26

I didn't add it ㅠ. I just followed the baseline as-is and this error came up. The English message does say .to('cpu') was called... but no matter how hard I look, there is no such code. I followed it exactly and still get the error.

IIllIIllIIll
2025.02.20 13:59

Hmm, then I think you'd have to debug it, either by running without the quantization config or by setting device_map manually, to figure out which part is the problem ㅠㅠ

용용죽겠지
2025.02.20 14:09

I tried device_map='auto', 'cpu', 'cuda:0', and '{:0}', and got exactly the same error every time. With the quantization config removed it does run, but apparently it takes about 4x longer... ㅠ

IIllIIllIIll
2025.02.20 14:21

If it only runs without quantization, then it could be that the GPU isn't being detected properly, or a CUDA version issue???
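A quick sanity check for both suspicions (not from the thread, just the standard checks) would be:

import torch

print(torch.__version__)            # PyTorch version
print(torch.version.cuda)           # CUDA version PyTorch was built against; None means a CPU-only build
print(torch.cuda.is_available())    # must be True for bitsandbytes 4-bit loading to use the GPU
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))

# recent bitsandbytes releases also bundle a diagnostic you can run in the same environment:
#   python -m bitsandbytes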

용용죽겠지
2025.02.20 14:47

The docs say CUDA 11.x through 12.6 are compatible, and I'm using 12.6.

IIllIIllIIll
2025.02.20 15:07

ㅠㅠ That's strange. From the error message it looks like the model is being moved to the CPU.

Sonny_S.W
2025.02.20 15:29

1. First, check whether the GPU is actually being detected.
2. Fix the code by following https://github.com/huggingface/accelerate/issues/2129.
Inside the parentheses of 'model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map="auto")' in cell 7, I suggest inserting these two parameters:
+ offload_folder=folder_offload, 
+ offload_state_dict=True

Sonny_S.W
2025.02.20 15:30

That is, try changing it to
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map="auto", offload_folder=folder_offload, offload_state_dict=True)
and see how it goes.
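For completeness, folder_offload in that call is just a writable directory that accelerate can spill weights into; a minimal sketch, assuming a local "offload" directory (the name is arbitrary):

import os

folder_offload = "offload"                 # any writable path works; the name is just an example
os.makedirs(folder_offload, exist_ok=True)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
    offload_folder=folder_offload,
    offload_state_dict=True,
)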

Sonny_S.W
2025.02.20 15:32

model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16, trust_remote_code=True, low_cpu_mem_usage = True).cpu()    

from accelerate import disk_offload
disk_offload(model=model, offload_dir=folder)

That issue also includes this snippet.

용용죽겠지
2025.02.20 16:23

When I added offload_folder="offload" before, the same error occurred,
but I'll give this a try. Thank you.

Redix6
2025.02.20 16:33

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map='auto',
    torch_dtype=torch.bfloat16,
    trust_remote_code=True)

Here I suspect torch_dtype=torch.bfloat16 is clashing with the quantization config and tangling things up.
To pin it down exactly, either trace through the code shown in the stack trace, or start from settings as close to the defaults as possible and add the options back one at a time, as in the sketch below.
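A rough sketch of that incremental approach (hf_device_map is filled in by accelerate whenever device_map is used):

# step 1: the smallest call that still quantizes -- no dtype override, no trust_remote_code
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# step 2: inspect where each layer actually landed; any 'cpu' or 'disk' entry means the GPU
# was not used, which is what makes dispatch_model hit the failing .to(device) call
print(model.hf_device_map)

# step 3: if this runs, add torch_dtype=torch.bfloat16 and trust_remote_code=True back one at a time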
