add github workflow to sync-with-huggingface
lqhl committed Jan 15, 2024
1 parent a515496 commit 1eb111f
Showing 16 changed files with 69 additions and 11 deletions.
50 changes: 50 additions & 0 deletions .github/workflows/sync-with-huggingface.yml
@@ -0,0 +1,50 @@
name: Sync with Hugging Face

on:
  push:
    branches:
      - main
    paths:
      - .github/workflows/sync-with-huggingface.yml
      - app/**

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Sync with Hugging Face
        uses: nateraw/[email protected]
        with:
          # The github repo you are syncing from. Required.
          github_repo_id: 'myscale/ChatData'

          # The Hugging Face repo id you want to sync to. (ex. 'username/reponame')
          # A repo with this name will be created if it doesn't exist. Required.
          huggingface_repo_id: 'myscale/ChatData'

          # Hugging Face token with write access. Required.
          # Here, we provide a token that we called `HF_TOKEN` when we added the secret to our GitHub repo.
          hf_token: ${{ secrets.HF_TOKEN }}

          # The type of repo you are syncing to: model, dataset, or space.
          # Defaults to space.
          repo_type: 'space'

          # If true and the Hugging Face repo doesn't already exist, it will be created
          # as a private repo.
          #
          # Note: this param has no effect if the repo already exists.
          private: false

          # If repo type is space, specify a space_sdk. One of: streamlit, gradio, or static
          #
          # This option is especially important if the repo has not been created yet.
          # It won't really be used if the repo already exists.
          space_sdk: 'streamlit'

          # If provided, subdirectory will determine which directory of the repo will be synced.
          # By default, this action syncs the entire GitHub repo.
          #
          # An example using this option can be seen here:
          # https://github.com/huggingface/fuego/blob/830ed98/.github/workflows/sync-with-huggingface.yml
          subdirectory: app
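
The `on.push` filters in the workflow above can be read as a simple predicate: the sync runs only for pushes to `main` in which at least one changed file is the workflow file itself or lives under `app/`. The following is a minimal sketch of that rule, not GitHub's actual glob matcher; the function names and the example file paths are hypothetical, and `app/**` is approximated by a prefix check.

```python
from typing import Iterable

# Values copied from the workflow above.
TRIGGER_BRANCH = "main"
WORKFLOW_FILE = ".github/workflows/sync-with-huggingface.yml"
SYNCED_DIR = "app"


def path_matches(path: str) -> bool:
    """Approximate the two `paths` filters: the workflow file, or any file under app/."""
    return path == WORKFLOW_FILE or path.startswith(SYNCED_DIR + "/")


def should_trigger(branch: str, changed_files: Iterable[str]) -> bool:
    """Return True if a push to `branch` touching `changed_files` would start the sync.

    A path-filtered workflow runs when at least one changed file matches a filter.
    """
    return branch == TRIGGER_BRANCH and any(path_matches(p) for p in changed_files)


# Hypothetical pushes, for illustration only.
readme_only = should_trigger("main", ["README.md"])
app_change = should_trigger("main", ["README.md", "app/app.py"])
other_branch = should_trigger("dev", ["app/app.py"])
```

Under this reading, a README-only push does not trigger a sync, which is consistent with the commit moving the application files into `app/` and setting `subdirectory: app`.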
28 changes: 18 additions & 10 deletions README.md
@@ -1,4 +1,5 @@
# ChatData 🔍 📖

***We are constantly improving LangChain's self-query retriever. Some of the features are not merged yet.***

[![](https://dcbadge.vercel.app/api/server/D2qpkqc4Jq?compact=true&style=flat)](https://discord.gg/D2qpkqc4Jq)
@@ -38,7 +39,7 @@ To enhance your experience and seamlessly continue interactions with existing se

In addition to tapping into ChatData's external knowledge base powered by MyScale for answers, you also have the option to upload your own files and establish a personalized knowledge base. We've implemented the Unstructured API for this purpose, ensuring that only processed texts from your documents are stored, prioritizing your data privacy.

In conclusion, with ChatData, you can effortlessly navigate through vast amounts of data, effortlessly accessing precisely what you need. Whether you're a researcher, a student, or a knowledge enthusiast, ChatData empowers you to explore academic papers and research documents like never before. Unlock the true potential of information retrieval with ChatData and discover a world of knowledge at your fingertips.
In conclusion, with ChatData, you can effortlessly navigate through vast amounts of data, effortlessly accessing precisely what you need. Whether you're a researcher, a student, or a knowledge enthusiast, ChatData empowers you to explore academic papers and research documents like never before. Unlock the true potential of information retrieval with ChatData and discover a world of knowledge at your fingertips.

➡️ Dive in and experience ChatData on [Hugging Face](https://huggingface.co/spaces/myscale/ChatData)🤗

@@ -126,8 +127,10 @@ python3 -m streamlit run app.py
```

### Where can I get those arXiv data?
- [From parquet files on S3](docs/self-query.md#insert-data)
- <a name="data-service"></a>Or Directly use MyScale database as service... for **FREE** ✨

- [From parquet files on S3](docs/self-query.md#insert-data)
- <a name="data-service"></a>Or Directly use MyScale database as service... for **FREE** ✨

```python
import clickhouse_connect

@@ -155,26 +158,32 @@ python3 -m streamlit run app.py

### Quickstart

1. Create an virtual environment
1. Enter directory `app/`

```bash
cd app/
```

2. Create an virtual environment

```bash
python3 -m venv .venv
source .venv/bin/activate
```

2. Install dependencies
3. Install dependencies

> This app is currently using [MyScale's technical preview of LangChain](https://github.com/myscale/langchain/tree/preview).
>>
> This app is currently using [MyScale's technical preview of LangChain](https://github.com/myscale/langchain/tree/preview).
>
>> It contains improved SQLDatabaseChain in [this PR](https://github.com/hwchase17/langchain/pull/7454)
>>
>>
>> It contains [improved prompts](https://github.com/hwchase17/langchain/pull/6737#discussion_r1243527112) for comparators `LIKE` and `CONTAIN` in [MyScale self-query retriever](https://github.com/hwchase17/langchain/pull/6143).

```bash
python3 -m pip install -r requirements.txt
```

3. Run the app!
4. Run the app!

```python
# fill you OpenAI key in .streamlit/secrets.toml
@@ -187,7 +196,6 @@ python3 -m streamlit run app.py

[*Read the full article*](https://blog.myscale.com/2023/07/17/teach-your-llm-vector-sql/)


- [Why Vector SQL?](https://blog.myscale.com/2023/07/17/teach-your-llm-vector-sql/#automate-the-whole-process-with-sql-and-vector-search)
- [How did LangChain and MyScale convert natural language to structured filters?](https://docs.myscale.com/en/advanced-applications/chatdata/#selfqueryretriever)
- [How to make chain execution more responsive in LangChain?](https://docs.myscale.com/en/advanced-applications/chatdata/#add-callbacks)
File renamed without changes.
@@ -1,4 +1,4 @@
MYSCALE_HOST = "msc-1decbcc9.us-east-1.aws.staging.myscale.cloud"
MYSCALE_HOST = "msc-4a9e710a.us-east-1.aws.staging.myscale.cloud"
MYSCALE_PORT = 443
MYSCALE_USER = "chatdata"
MYSCALE_PASSWORD = "myscale_rocks"
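
The settings in this hunk are the values the app needs to open a MyScale connection. As a hedged illustration of how they might be consumed, the sketch below turns a dict-like secrets object into keyword arguments for `clickhouse_connect.get_client` (the package imported in the README snippet). The helper name `myscale_client_kwargs` is hypothetical and is not taken from the repository; the example values are the ones on the added side of the diff.

```python
from typing import Any, Dict, Mapping

REQUIRED_KEYS = ("MYSCALE_HOST", "MYSCALE_PORT", "MYSCALE_USER", "MYSCALE_PASSWORD")


def myscale_client_kwargs(secrets: Mapping[str, Any]) -> Dict[str, Any]:
    """Validate the secrets and map them to get_client keyword arguments."""
    missing = [key for key in REQUIRED_KEYS if key not in secrets]
    if missing:
        raise KeyError("missing secrets: " + ", ".join(missing))
    return {
        "host": str(secrets["MYSCALE_HOST"]),
        "port": int(secrets["MYSCALE_PORT"]),
        "username": str(secrets["MYSCALE_USER"]),
        "password": str(secrets["MYSCALE_PASSWORD"]),
    }


# Values as they appear after this commit.
kwargs = myscale_client_kwargs(
    {
        "MYSCALE_HOST": "msc-4a9e710a.us-east-1.aws.staging.myscale.cloud",
        "MYSCALE_PORT": 443,
        "MYSCALE_USER": "chatdata",
        "MYSCALE_PASSWORD": "myscale_rocks",
    }
)

# With clickhouse_connect installed, the client could then be created as:
#   client = clickhouse_connect.get_client(**kwargs)
```

Checking for missing keys up front gives a clearer error than a failed connection attempt when a secrets file is incomplete.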
File renamed without changes. (12 more files)