Update README.md
README.md
@@ -1,3 +1,125 @@
---
license: mit
language:
- en
pipeline_tag: question-answering
---
# cosmosage

Cosmosage is a natural-language cosmology assistant that can answer questions about cosmology.

cosmosage_v1 is a fine-tune of fblgit/UNA-TheBeagle-7b-v1 on various cosmology-related datasets, including
open-access textbooks and scientific publications. It is primarily intended for Q&A mode, where the model
gives a single answer in response to a single question, though it performs decently as a chatbot as well.
See https://github.com/tijmen/cosmosage for more details.

## Usage

One convenient way to use this model is with `text-generation-webui`:

https://github.com/oobabooga/text-generation-webui

- Under the Model tab, download a custom model by entering "Tijmen2/cosmosage_v1_gptq".
- When the download is complete, click the refresh icon next to Model.
- Select AutoGPTQ as your model loader.
- In the Model dropdown, choose `cosmosage_v0.4` and load it.
- Head over to the Notebook tab and ask away!

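The model can presumably also be driven from Python instead of the web UI. The following is a minimal sketch, not a tested recipe: it assumes the `transformers` library is installed together with a GPTQ backend (for example `optimum` and `auto-gptq`) and `accelerate` for `device_map="auto"`, none of which this README states. The helper name `generate_answer` and the `GENERATION_KWARGS` dictionary are illustrative; the temperature is the value quoted in the Prompt Style section.

```python
MODEL_ID = "Tijmen2/cosmosage_v1_gptq"

# Sampling settings are assumptions for illustration; the temperature matches
# the value the README author reports using.
GENERATION_KWARGS = {"max_new_tokens": 256, "do_sample": True, "temperature": 0.4}


def generate_answer(prompt: str) -> str:
    """Load the model (weights are downloaded on first use) and complete `prompt`."""
    # Imported lazily so that defining this helper does not require the
    # heavy dependencies to be installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # Assumption: a GPTQ backend is available so the quantized weights can load.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, **GENERATION_KWARGS)
    # Drop the prompt tokens and decode only the newly generated ones.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()


# Example call (commented out because it downloads several GB of weights):
# print(generate_answer("USER: What is the ISW effect?\nASSISTANT:"))
```

This sketch reloads the model on every call for brevity; a longer-lived script would load the tokenizer and model once and reuse them.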
## Prompt Style

The model was trained on data formatted like this:
```
{system, optional}
USER: {question}
ASSISTANT: {answer}
```

Feel free to experiment with different prompts and sampling parameters. Using no system prompt, i.e.
```
USER: <<<your question here>>>
ASSISTANT:
```
works well, but usually leads to very short answers. If you want longer responses, ask for them in the
system prompt.

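The training format above can be assembled programmatically. The sketch below is one hypothetical way to do it; the function name `build_prompt` and its parameters are illustrative and do not come from the cosmosage repository.

```python
from typing import Optional


def build_prompt(question: str, system: Optional[str] = None) -> str:
    """Return a single-turn prompt in the USER/ASSISTANT training format."""
    lines = []
    if system:
        # The system text is optional and, when present, comes first.
        lines.append(system.strip())
    lines.append(f"USER: {question.strip()}")
    # The prompt ends with the bare ASSISTANT tag so the model writes the answer.
    lines.append("ASSISTANT:")
    return "\n".join(lines)


print(build_prompt("What is dark energy?", system="Answer in detail."))
```

Passing a `system` string that asks for detailed answers is the README's suggested way to get longer responses.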
The following is an example prompt that leverages the model's instruction-following capabilities as
well as two-shot prompting. I've been using it with `temperature=0.4`, `min_p=0.1`.

```
You are an expert cosmology assistant, able to answer questions on the cosmic microwave background,
galaxy formation, large scale structure, theoretical cosmology, inflation, big bang nucleosynthesis, cosmology
instrumentation, and other related topics. Please assume the user is fluent in scientific terminology. Elaborate
where possible to give a complete answer. If you do not know, say you do not know. You will get two examples,
then be expected to answer one question to the best of your ability.


## EXAMPLE 1
USER: What is the ISW effect?
ASSISTANT: The Integrated Sachs-Wolfe effect describes the fact that cosmic microwave background photons gain
energy as they fall into potential wells and lose it again as they climb back out. The origin of the effect is
gravitational redshift. For static potential wells, there is no net effect. However, if the potentials grow or
decay, there is a net effect, known as the Integrated Sachs-Wolfe effect. In a matter-dominated universe, linear
potentials stay constant. The late-time ISW effect refers to the fact that at late times, dark energy causes a
deviation from matter domination and linear potentials decay. This causes an effect in the cosmic microwave
background which has been measured in cross-correlation with large-scale structure. The early-time ISW refers
to the fact that right after recombination, radiation still contributed a non-negligible amount of the energy
density of the universe, causing potentials to grow. This causes the photons to gain energy. The early-time ISW
has not been detected experimentally.

## EXAMPLE 2
USER: What are the definitions of galaxy cluster mass (M200m, M500c, etc.)?
ASSISTANT: Calculating a galaxy cluster's mass is like calculating the population of a sprawling city; as you
get further out from city center, the suburbs never end. Eventually you have to stop counting and just say that
from there on the people belong to a different town.
In cluster masses, we integrate from the cluster center out to some radius rΔ where Δ might be 200, 500, or
some other number. The definition of this radius is that the density of the enclosed mass is Δ times the
average density of the universe at that redshift. The associated mass is MΔ.
Note that 'average density' can be defined in two ways. You can either include the energy density of dark
energy - in which case you are comparing to the critical density at that redshift - or you can use the matter
density only. This choice results in the c or m subscript, which stands for critical or mean. Note that the
critical density is higher than the mean density, so it gives an earlier cutoff in the integral. Therefore
'MΔc' masses are smaller than 'MΔm' ones.

## QUESTION
USER: <<<INSERT YOUR QUESTION>>>
ASSISTANT:
```

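The layout of the two-shot prompt above (system text, numbered `## EXAMPLE` blocks, then a `## QUESTION` block) can be generated from data. This is a minimal sketch under the assumption that the blank-line spacing shown above is what the model expects; `build_few_shot_prompt` is a hypothetical helper, not part of cosmosage.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    system: str, examples: List[Tuple[str, str]], question: str
) -> str:
    """Assemble a few-shot prompt with the section headers used in the README."""
    # System text, followed by the two blank lines seen in the README prompt.
    parts = [system.strip(), ""]
    for i, (example_question, example_answer) in enumerate(examples, start=1):
        parts += [
            "",
            f"## EXAMPLE {i}",
            f"USER: {example_question}",
            f"ASSISTANT: {example_answer}",
        ]
    # The final block leaves ASSISTANT: open for the model to complete.
    parts += ["", "## QUESTION", f"USER: {question}", "ASSISTANT:"]
    return "\n".join(parts)


demo = build_few_shot_prompt(
    "You are an expert cosmology assistant.",
    [("What is the ISW effect?", "(worked answer 1)"),
     ("What is M200m?", "(worked answer 2)")],
    "What is the Sunyaev-Zel'dovich effect?",
)
print(demo)
```

The placeholder answers in `demo` stand in for full worked answers such as the two given in the prompt above.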
## Example outputs

```
USER: What is the current best measure of the optical depth to reionization?
ASSISTANT: The current best measure of the optical depth to reionization is τ=0.054±0.007, as determined by the
Planck Collaboration et al. (2018b). This value provides strong evidence for reionization at high redshift and
helps constrain models of cosmic reionization.

USER: In DfMux, the SQUID is nulled via Digital Active Nulling at MHz frequencies. How is the SQUID kept nulled
at low frequency?
ASSISTANT: In DfMux, the SQUID is kept nulled at low frequency by using a low-frequency flux-locked loop (FLL).
The FLL is a feedback system that maintains the SQUID in its nulled state by adjusting the input current to
cancel out the flux signal. This is necessary because the SQUID is sensitive to magnetic fields, and the
magnetic field generated by the input current can change the bias flux, causing the SQUID to lose its null
state. The FLL helps to maintain the null state by continuously adjusting the input current to counteract any
changes in the bias flux.

USER: Why does a cross-spectrum not suffer from noise bias?
ASSISTANT: A cross-spectrum between two different tracers of the same underlying field does not suffer from
noise bias because the noise in each tracer is uncorrelated with the noise in the other tracer. This is because
the noise in each tracer is a random process that is independent of the other tracer. Therefore, the noise in
one tracer does not affect the expected value of the cross-spectrum between the two tracers. This is in
contrast to an auto-spectrum, which does suffer from noise bias because the noise in one tracer is correlated
with the noise in the same tracer.

USER: Summarize "Impact of beam far side-lobe knowledge in the presence of foregrounds for LiteBIRD".
ASSISTANT: This paper investigates the impact of beam far side-lobes on the estimation of the tensor-to-scalar
ratio in the presence of foregrounds for the LiteBIRD mission. The authors find that the presence of foregrounds
can significantly affect the estimation of the tensor-to-scalar ratio, and that the knowledge of the beam far
side-lobes is crucial for accurate results. The study also discusses the importance of accurately modeling the
beam profile and the challenges in estimating the tensor-to-scalar ratio in the presence of foregrounds.
```

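The reasoning in the cross-spectrum answer above can be checked with a toy simulation. The sketch below is a single-mode analogue rather than a real power spectrum estimator: it assumes a unit-variance Gaussian signal shared by two tracers, each with independent Gaussian noise, and compares the mean auto-product with the mean cross-product. The function name and parameters are illustrative.

```python
import random


def auto_and_cross_power(
    n_samples: int = 200_000, noise_sigma: float = 1.0, seed: int = 0
):
    """Return (mean of a*a, mean of a*b) for two noisy tracers of one signal."""
    rng = random.Random(seed)
    auto_sum = 0.0
    cross_sum = 0.0
    for _ in range(n_samples):
        signal = rng.gauss(0.0, 1.0)                   # shared signal, variance 1
        tracer_a = signal + rng.gauss(0.0, noise_sigma)  # independent noise
        tracer_b = signal + rng.gauss(0.0, noise_sigma)  # independent noise
        auto_sum += tracer_a * tracer_a    # picks up signal^2 + noise^2
        cross_sum += tracer_a * tracer_b   # noise terms average to zero
    return auto_sum / n_samples, cross_sum / n_samples


auto_power, cross_power = auto_and_cross_power()
print(f"auto:  {auto_power:.3f} (expectation: 1 + noise_sigma**2 = 2)")
print(f"cross: {cross_power:.3f} (expectation: 1, the signal variance)")
```

The auto-product is biased upward by the noise variance, while the cross-product converges on the signal variance alone because the two noise realizations are independent.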
## Qualitative evaluation

cosmosage_v1 is by far the best cosmosage model yet. However, like all LLMs, it continues to struggle with
_reliability_. While many of its answers are factually accurate, some are not. The outputs of cosmosage
(or any LLM) should not be trusted to be factual.