Spaces: Corin1998/Auto_PPT_Generator


Corin1998 committed on Sep 16

Commit b32e168 (verified) · 1 Parent(s): 792d628

Upload 4 files


Files changed (4)

README.md +28 -12
app.py +103 -0
requirements.txt +11 -0
runtime.txt +1 -0

README.md CHANGED

@@ -1,12 +1,28 @@
----
-title: Auto PPT Generator
-emoji: 😻
-colorFrom: blue
-colorTo: blue
-sdk: gradio
-sdk_version: 5.45.0
-app_file: app.py
-pinned: false
----
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

+# Auto-PPT Generator (Hugging Face Space)
+End-to-end pipeline: Long text → Summary → Sectioning → Bullets/Tables/Charts → PPTX export, with theme color and logo.
+## Run on a Space
+1. Create a Gradio Space and upload all files.
+2. (Optional) **Settings → Variables & secrets**: add `HF_TOKEN` if you will use the Inference API.
+3. Click **Run**.
+## Models
+- **Local CPU (English):** `sshleifer/distilbart-cnn-12-6` (summarization)
+- **Local CPU (Japanese):** `sonoisa/t5-base-japanese` (use `text2text-generation` path with `要約:` prefix)
+- **Inference API:** any instruct/summarization model you have access to (e.g., `Qwen/Qwen2-7B-Instruct`, `elyza/ELYZA-japanese-Llama-2-7b-fast-instruct`).
+## Input conventions
+- **Tables:** Provide lines like `項目: 値` ("item: value") under a section to auto-build a 2-column table.
+- **Charts:** Provide lines like `ラベル: 123` ("label: 123", with a numeric value) to auto-build a bar chart.
+- **Bullets:** Lines starting with `-`, `*`, `・`, or numbered lists are detected automatically.
+## Notes
+- Slide numbers are approximated (python-pptx lacks true auto-numbering fields).
+- For corporate fonts, pre-install them or post-process the PPTX if required.
+- For very large texts, model input is truncated to each model's token limit, but the rule-based extractors remain robust.
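
The input conventions in the README above can be illustrated with a small rule-based classifier. This is a minimal sketch, not the commit's implementation: the real rules live in `modules/text_processing.py`, which is not part of this diff. The function name `classify_lines`, the regular expressions, and the choice to send numeric `key: value` lines only to the chart (and not also to the table) are all assumptions made for illustration.

```python
import re

# Hypothetical patterns; the actual rules in modules/text_processing.py may differ.
# Bullets: lines starting with -, *, ・, or a number followed by "." or ")".
BULLET_RE = re.compile(r"^\s*(?:[-*・]|\d+[.)])\s*(.+)$")
# Pairs: "key: value" with an ASCII or full-width colon.
PAIR_RE = re.compile(r"^\s*([^::]+?)\s*[::]\s*(.+?)\s*$")
NUMBER_RE = re.compile(r"^-?\d+(?:\.\d+)?$")


def classify_lines(section_text):
    """Split a section's lines into bullets, table rows, and chart points."""
    bullets, table_rows, chart_points = [], [], []
    for line in section_text.splitlines():
        if not line.strip():
            continue
        bullet = BULLET_RE.match(line)
        if bullet:
            bullets.append(bullet.group(1).strip())
            continue
        pair = PAIR_RE.match(line)
        if pair:
            key, value = pair.group(1), pair.group(2)
            number = value.replace(",", "")  # tolerate thousands separators
            if NUMBER_RE.match(number):
                chart_points.append((key, float(number)))
            else:
                table_rows.append((key, value))
    return {"bullets": bullets, "table": table_rows, "chart": chart_points}
```

Lines that match none of the patterns are ignored here; in a fuller pipeline they would presumably feed the summarizer instead.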

app.py ADDED

	@@ -0,0 +1,103 @@

+import os
+import io
+import time
+import gradio as gr
+from modules.text_processing import process_text
+from modules.pptx_builder import build_presentation
+from modules.utils import safe_hex_to_rgb, ensure_tmpdir
+APP_NAME = "Auto-PPT Generator"
+def generate_pptx(long_text: str,
+                  title: str,
+                  theme_hex: str,
+                  logo_file,
+                  add_summary: bool,
+                  add_tables: bool,
+                  add_charts: bool,
+                  use_inference_api: bool,
+                  summarize_model: str,
+                  generator_model: str,
+                  max_summary_words: int):
+    if not long_text or not long_text.strip():
+        # Message: "The input text is empty. Please paste a long text."
+        raise gr.Error("入力テキストが空です。長文を貼り付けてください。")
+    theme_rgb = safe_hex_to_rgb(theme_hex or "#3B82F6")
+    # Read logo (optional); gr.File(type="filepath") passes a path string
+    logo_bytes = None
+    if logo_file is not None:
+        with open(logo_file, "rb") as f:
+            logo_bytes = f.read()
+    # Steps 1-3: NLP pipeline (summary, sections, bullets, tables, chart data)
+    result = process_text(
+        text=long_text,
+        use_inference_api=use_inference_api,
+        summarize_model=summarize_model,
+        generator_model=generator_model,
+        want_summary=add_summary,
+        want_tables=add_tables,
+        want_charts=add_charts,
+        max_summary_words=max_summary_words,
+    )
+    # Step 4: Build PPTX
+    ensure_tmpdir()
+    timestamp = time.strftime("%Y%m%d-%H%M%S")
+    out_path = f"/tmp/auto_ppt_{timestamp}.pptx"
+    build_presentation(
+        output_path=out_path,
+        title=title or "Auto-PPT",
+        theme_rgb=theme_rgb,
+        logo_bytes=logo_bytes,
+        executive_summary=result.get("summary"),
+        sections=result.get("sections", []),
+        bullets_by_section=result.get("bullets", {}),
+        tables=result.get("tables", []),
+        charts=result.get("charts", []),
+    )
+    # Return file path for download
+    return out_path
+def ui():
+    with gr.Blocks(title=APP_NAME) as demo:
+        # Subtitle: "Automates long text -> summary -> sectioning -> bullets/tables/charts -> PPTX export"
+        gr.Markdown(f"# {APP_NAME}\n長文→要約→セクション分割→箇条書き/表/図→**PPTX出力**まで自動化")
+        with gr.Row():
+            with gr.Column(scale=2):
+                # Labels: "Long text (paste)", "Title" (default "Auto-generated slides"),
+                # "Theme color", "Logo image (optional)"
+                long_text = gr.Textbox(label="長文テキスト(貼り付け)", lines=20, placeholder="ここに長文テキストを貼り付けてください...")
+                title = gr.Textbox(label="タイトル", value="自動生成スライド")
+                theme_hex = gr.ColorPicker(label="テーマカラー", value="#3B82F6")
+                logo = gr.File(label="ロゴ画像(任意)", type="filepath")
+                with gr.Row():
+                    # Labels: "Add a summary slide", "Detect and add tables", "Generate and add charts"
+                    add_summary = gr.Checkbox(label="要約スライドを追加", value=True)
+                    add_tables = gr.Checkbox(label="表を検出して追加", value=True)
+                    add_charts = gr.Checkbox(label="チャートを生成して追加", value=True)
+            with gr.Column(scale=1):
+                # Labels: "Use the Hugging Face Inference API", "Summarization model name (local or API)",
+                # "Generation model (API recommended, optional)", "Max words in summary",
+                # "Generate PPTX", "Download"
+                use_inference_api = gr.Checkbox(label="Hugging Face Inference APIを使用", value=False)
+                summarize_model = gr.Textbox(label="要約モデル名(local or API)", value="sshleifer/distilbart-cnn-12-6")
+                generator_model = gr.Textbox(label="生成モデル(API推奨,任意)", value="")
+                max_summary_words = gr.Slider(minimum=50, maximum=600, value=200, step=10, label="要約の最大単語数")
+                generate = gr.Button("PPTXを生成", variant="primary")
+                output_file = gr.File(label="ダウンロード")
+        # The order of inputs must match the parameter order of generate_pptx
+        generate.click(
+            fn=generate_pptx,
+            inputs=[
+                long_text, title, theme_hex, logo, add_summary, add_tables, add_charts,
+                use_inference_api, summarize_model, generator_model, max_summary_words,
+            ],
+            outputs=output_file,
+        )
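+        # Tips (English): use `sonoisa/t5-base-japanese` (`text2text-generation`) for Japanese summaries;
+        # set `HF_TOKEN` as a Space secret when using the Inference API;
+        # lines of the form `Label: 123` are detected automatically and drawn as bar charts.
+        gr.Markdown("""
+        **Tips**
+        - 日本語要約には `sonoisa/t5-base-japanese` を推奨 (`text2text-generation`)。
+        - Inference API を使う場合は、Space の Secret に `HF_TOKEN` を設定してください。
+        - チャートは `Label: 123` 形式の行を自動検出して棒グラフを作成します。
+        """)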
+    return demo
+if __name__ == "__main__":
+    app = ui()
+    app.launch()
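
`app.py` imports `safe_hex_to_rgb` from `modules.utils`, but that module is not included in this commit. The following is a hedged sketch of what such a helper might look like, assuming it accepts `#RRGGBB` or `#RGB` strings and falls back to a default color on anything else. The fallback behavior, the `default` parameter, and the regular expression are assumptions; only the function name and the `#3B82F6` default come from the diff.

```python
import re

# Accepts "RRGGBB" or "RGB", with or without a leading "#".
_HEX_RE = re.compile(r"^#?([0-9a-fA-F]{3}|[0-9a-fA-F]{6})$")
DEFAULT_RGB = (0x3B, 0x82, 0xF6)  # "#3B82F6", the default color used in app.py


def safe_hex_to_rgb(value, default=DEFAULT_RGB):
    """Convert a hex color string to an (r, g, b) tuple, or return `default`."""
    match = _HEX_RE.match((value or "").strip())
    if not match:
        return default  # assumed behavior: never raise on bad input
    digits = match.group(1)
    if len(digits) == 3:
        # Expand shorthand such as "fff" to "ffffff".
        digits = "".join(ch * 2 for ch in digits)
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))
```

Returning a default instead of raising keeps the UI usable when the color picker yields an unexpected string, at the cost of silently ignoring a malformed value.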

requirements.txt ADDED

	@@ -0,0 +1,11 @@

+gradio==4.44.0
+transformers>=4.44.0
+sentencepiece>=0.1.99
+accelerate>=0.33.0
+torch>=2.2.0
+python-pptx>=0.6.23
+matplotlib>=3.8.4
+pillow>=10.2.0
+pandas>=2.2.2
+numpy>=1.26.4
+requests>=2.31.0

runtime.txt ADDED

	@@ -0,0 +1 @@


+python-3.10