Space: jeanbaptdzd/open-finance-llm-8b

open-finance-llm-8b / app

Commit History

fix: vLLM tool calling - enable by default with hermes parser

7239fe3

jeanbaptdzd committed 17 days ago

Align HF Space API response format with vLLM

6393558

jeanbaptdzd committed 18 days ago

Improve reasoning tag removal for unclosed tags

8c38d11

jeanbaptdzd committed 21 days ago

Add reasoning tag removal for all chat responses

b1e1444

jeanbaptdzd committed 21 days ago

Fix dict access for inputs after device placement

4bee8ff

jeanbaptdzd committed 21 days ago

Fix device placement for tokenizer outputs before model inference

64c014e

jeanbaptdzd committed 21 days ago

Add error handling for invalid log level configuration

7ee7723

jeanbaptdzd committed 21 days ago

Refactor: Address code shortcomings and align with HF best practices

dc14519

jeanbaptdzd committed 21 days ago

Remove chat_service.py abstraction layer

c77ec91

jeanbaptdzd committed 24 days ago

Set temperature=0 for JSON format output (greedy decoding)

78ed4ff

jeanbaptdzd committed 24 days ago

Fix temperature modification: only apply to JSON format, not tools

c898602

jeanbaptdzd committed 24 days ago

Improve structured output: lower temperature for JSON/tool calls, remove unused stopping criteria

90a906d

jeanbaptdzd committed 24 days ago

Strengthen prompts with examples for tool calls and JSON format

cb7f3d3

jeanbaptdzd committed 24 days ago

Improve tool call parsing: handle reasoning tags and extract JSON tool calls

4a04968

jeanbaptdzd committed 24 days ago

Fix reasoning tag: use <think> instead of <think>

d730034

jeanbaptdzd committed 24 days ago

Fix reasoning tag handling: better support for unclosed <think> tags

a5e663f

jeanbaptdzd committed 24 days ago

Strengthen JSON format instructions: more explicit and in English

d39e295

jeanbaptdzd committed 24 days ago

Fix reasoning tag: use correct <think> tag pattern

682f9cd

jeanbaptdzd committed 24 days ago

Fix reasoning tag regex to match both <think> and <think> tags

875263b

jeanbaptdzd committed 24 days ago

Simplify reasoning tag removal: use single pattern for both tag types

b9ca306

jeanbaptdzd committed 24 days ago

Fix reasoning tag patterns: handle <think> and <think> correctly

5a4f1e9

jeanbaptdzd committed 24 days ago

Fix reasoning tag handling: support both <think> and <think>

28af6d2

jeanbaptdzd committed 24 days ago

Fix JSON extraction to handle reasoning tags

ad2ecea

jeanbaptdzd committed 24 days ago

Fix OpenAI API compatibility: support tool_choice='required' and response_format

a82e45b

jeanbaptdzd committed 24 days ago

Add deprecation warning for clear_gpu_memory model/tokenizer parameters

92bb437

jeanbaptdzd committed 26 days ago

Fix model ID and improve memory management

9db586c

jeanbaptdzd committed 26 days ago

Merge feat/tool-enabling into master - resolve conflicts

192844a

jeanbaptdzd committed 26 days ago

feat: Enable tool calls support in OpenAI API

895a63f

jeanbaptdzd committed 26 days ago

feat: Add rate limiting, stats tracking, and fix critical issues

67befa7

jeanbaptdzd committed 26 days ago

refactor: Enhance codebase with comprehensive improvements for CodeRabbit review

1e23279

jeanbaptdzd committed on Nov 3

refactor: Improve type hints and code quality across codebase

20548ac

jeanbaptdzd committed on Nov 3

fix: Apply CodeRabbit suggestions

fdc8bbe

jeanbaptdzd committed on Nov 3

feat: Add input validation and type hints

f28306b

jeanbaptdzd committed on Nov 3

Increase max_tokens to 1000 and request concise answers

83ffe61

jeanbaptdzd committed on Nov 2

Set DEFAULT_MAX_TOKENS=800 to prevent timeouts

bedfb0c

jeanbaptdzd committed on Nov 2

refactor: DRY improvements and optimize Dockerfile

16c2a22

jeanbaptdzd committed on Nov 2

refactor: Clean up codebase - remove obsolete files and improve documentation

6541672

jeanbaptdzd committed on Nov 2

Show complete answers in quiz + increase max_tokens to 1500

33a2ae7

jeanbaptdzd committed on Nov 2

Fix truncation: increase max_tokens and proper finish_reason

9f2572d

jeanbaptdzd committed on Nov 2

Add debug endpoint to inspect prompt generation

15ee2a4

jeanbaptdzd committed on Nov 2

Add detailed logging for chat template debugging

ef2ab5b

jeanbaptdzd committed on Nov 2

Load custom chat_template.jinja from model repository

d711c35

jeanbaptdzd committed on Nov 2

Add proper Qwen3 chat template to finance model

27930d6

jeanbaptdzd committed on Nov 2

Fix critical bugs: OOM errors, race conditions, truncation, and French language support

5ac5a91

jeanbaptdzd committed on Nov 2

Add GPU memory cleanup and fix OOM errors - cleanup cache after each inference

d31f411

jeanbaptdzd committed on Nov 2

Fix generation: increase tokens for complete answers, add EOS handling

78f67d6

jeanbaptdzd committed on Nov 2

Rename vllm.py to transformers_provider.py - clarify implementation and force rebuild

afd6869

jeanbaptdzd committed on Nov 2

Migrate from vLLM to Transformers library

9c71bb7

jeanbaptdzd committed on Nov 2

Upgrade vLLM to 0.11.0 for Qwen3ForCausalLM support

dc80161

jeanbaptdzd committed on Nov 2

Update to vLLM 0.9.2 with Qwen3 support, remove PRIIPS functionality, add HF Space validation hook

a750766

jeanbaptdzd committed on Nov 2
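
Several commits in this history deal with removing reasoning tags from chat responses, including the case where a <think> tag is opened but never closed. The snippet below is a minimal sketch of that kind of cleanup, not the Space's actual code: the function name strip_reasoning, the two regex patterns, and the choice to discard everything after an unclosed tag are all assumptions made for illustration.

```python
import re

# Hypothetical patterns (assumed, not taken from the repository).
# A closed block: <think> ... </think>, matched non-greedily across newlines.
_CLOSED_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)
# An unclosed block: <think> with no closing tag, running to the end of the text.
_UNCLOSED_BLOCK = re.compile(r"<think>.*\Z", re.DOTALL)


def strip_reasoning(text: str) -> str:
    """Remove closed <think> blocks first, then any trailing unclosed block."""
    text = _CLOSED_BLOCK.sub("", text)
    # Assumption: output after an unclosed <think> is reasoning, so it is dropped.
    text = _UNCLOSED_BLOCK.sub("", text)
    return text.strip()


if __name__ == "__main__":
    print(strip_reasoning("<think>check the rate</think>The yield is 4%."))
    print(strip_reasoning("The yield is 4%. <think>hmm, but maybe"))
```

Removing closed blocks before unclosed ones matters: if the order were reversed, the unclosed pattern would match from the first <think> to the end of the text and delete the answer along with the reasoning.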