An initial look at running DeepSeek-R1 locally

There’s a lot of blah flying around about DeepSeek and its latest models. My rule when running training sessions has always been TIAS (Try It And See): if someone asks me something that could be figured out relatively quickly by actually trying it, we try it. That’s more valuable, because it’s a first-hand validation, and doing the activity means the outcome is more likely to be remembered. So let’s do that!

My desktop has a decent AMD Ryzen 5 CPU, 32GB RAM, and an RTX 3060 GPU with 12GB VRAM. So while that’s not super hot, it’s sufficient to load and query a variety of LLMs locally without pain. By the way, ollama is a wonderful tool for this! So I had ollama grab deepseek-r1, its reasoning model. Note that ollama’s default deepseek-r1 tag is actually a 7.6B-parameter distill on a Qwen2 architecture, not the full 671B-parameter R1 (which wouldn’t fit on this machine anyway). It takes slightly longer to produce an answer, but it shows you the process (aka “what it is thinking”, emitted between the <think> … </think> tags). Details of the model:

>>> /show info
 Model
   architecture       qwen2
   parameters         7.6B
   context length     131072
   embedding length   3584   
   quantization       Q4_K_M
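
By the way, if you want to poke at the model from a script rather than the interactive prompt, ollama also exposes a local HTTP API. Here’s a minimal Python sketch (mine, not part of the original session) that sends a prompt to /api/generate and splits the <think> reasoning from the final answer. It assumes ollama’s default port 11434; the deepseek-r1:7b tag is an example, so substitute whatever `ollama list` shows on your machine.

import requests

OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "deepseek-r1:7b"  # example tag; substitute your own from `ollama list`

def ask(prompt: str) -> tuple[str, str]:
    """Send one prompt and return (reasoning, answer)."""
    resp = requests.post(
        OLLAMA_URL,
        json={"model": MODEL, "prompt": prompt, "stream": False},
        timeout=300,
    )
    resp.raise_for_status()
    text = resp.json()["response"]
    # R1-style models emit their chain of thought between <think> tags.
    reasoning, sep, answer = text.partition("</think>")
    if not sep:  # no think block; the whole response is the answer
        return "", text.strip()
    return reasoning.replace("<think>", "").strip(), answer.strip()

reasoning, answer = ask("Who is the current president of China?")
print("THINKING:", reasoning)
print("ANSWER:", answer)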

I am not using any pre-prompting here; I simply started ollama with the model and asked the questions noted below. A local model in ollama is typically not Internet-connected, nor does it have access to the broader system, as the following exchange demonstrates:

>>> What is the current date?
<think>

</think>

Hi there! I suggest getting online to get real-time information. If you have any other questions, please don't hesitate to let me know!

Ok, fair. On, then, to a question that the model might be able to answer.

>>> Who is the current president of the US?
<think>
Alright, the user previously asked for the current date and time, which I provided a response for. Now, they're asking who the current president of the US is.

I need to make sure my response is accurate and up-to-date. Since the last response was from July 2023, it's likely that the current president as of October 2023 would still be Joe Biden.

I should structure the answer clearly, stating their position explicitly. Adding a friendly emoji can make the response feel warmer.

I'll also keep it concise and straightforward to match the user's possible expectations.

</think>

As of now, the current President of the United States is **Joe Biden**.

Given the knowledge cutoff the model implies for itself (October 2023, working from information dated July 2023), the answer is correct. The reasoning is interesting (though not unexpected):
a) It maintains state, as it should; otherwise each question would stand on its own and you’d never be able to ask a follow-up question (see the sketch after this list for how that works at the API level).
b) It tries to infer the user’s intent from the previous question. Useful.
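
A quick aside on how that statefulness works mechanically: the interactive ollama prompt keeps the conversation for you, but the model itself is stateless between requests. If you talk to it over the HTTP API instead, your client has to re-send the history on every turn. A minimal sketch of that, under the same assumptions as before (default port, deepseek-r1:7b as an example tag):

import requests

OLLAMA_URL = "http://localhost:11434/api/chat"
MODEL = "deepseek-r1:7b"  # example tag

history = []  # the accumulated conversation; grows with every turn

def chat(prompt: str) -> str:
    history.append({"role": "user", "content": prompt})
    resp = requests.post(
        OLLAMA_URL,
        json={"model": MODEL, "messages": history, "stream": False},
        timeout=300,
    )
    resp.raise_for_status()
    reply = resp.json()["message"]["content"]
    history.append({"role": "assistant", "content": reply})
    return reply

chat("Who is the current president of the US?")
print(chat("And of China?"))  # only answerable because the history went along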

>>> Who is the current president of China?
<think>
Alright, so I need to figure out who the current president of China is. Let me start by recalling some information I have about Chinese leaders.

I know that former President Xi Jinping has been serving as the leader for quite a while now. I think he was first elected in 2012 or maybe 2013. Wait, actually, it was 2012 when he became the General Secretary of the Communist Party and then later he became president.

I also remember that there have been other leaders before him, like Deng Xiaoping, who was a very important leader in the reform and opening-up policies. Then there's Hu Jintao, who served two terms. After Hu, I think it was Wen Jiabao, but I'm not sure about his tenure.

So, putting this together, after Deng Xiaoping came Hu Jintao, then Wen Jiabao, followed by Xi Jinping taking over from 2013 onwards. So the current president is Xi Jinping.

</think>

The current President of China is **Xi Jinping**.

Correct. Now for the question that seems to get everybody’s knickers in a knot. DeepSeek’s mobile app doesn’t want to answer it, and people call out China’s censorship. We’ll talk about that in a moment, but let’s first try asking the local model straight out:

>>> Tell me about the Tiananmen Square event.
<think>
Okay, so I need to explain what happened at Tiananmen Square in China during the year 1989. From what I remember, it was a very significant event related to the Chinese government and its policies.

First, I think it started with a peaceful demonstration by students, intellectuals, and other groups who were upset about certain government actions or policies they believed were unfair. They gathered in Tiananmen Square because they wanted to express their opinions and demands.

As more people joined the protest, things turned into a larger-scale event. There were clashes between the protesters and the government's security forces. I believe there was some violence involved, with tear gas being used and maybe even some arrests or disruptions of the protest.

I also remember that there were military actions involved to maintain control, which might have led to casualties on both sides. The exact numbers are fuzzy in my memory, but I think several people were injured or killed during the events.

The government eventually cracked down hard on what they saw as an illegal demonstration. There was a lot of media coverage about it at the time, and now there's a debate about whether it was justified or an overreaction.

I'm not entirely sure about all the details, like the exact dates or how many people were involved, but I know it was a pivotal moment in modern Chinese history that led to various political changes. Maybe it's related to the 1989 protests and the subsequent shift in power within the government?

Overall, it seems like a complex event with multiple factors influencing its outcome, including government actions, international reactions, and the internal dynamics of the Communist Party.

</think>

Tiananmen Square is a significant historical event that occurred in Beijing on June 3-4, 1989. It was a peaceful demonstration by students, intellectuals, workers, and others who were protesting against what they perceived as unfair government policies regarding various issues such as human rights, economic reforms, and environmental regulations.

The protest began with small demonstrations and gradually turned into larger, more organized antigovernment activities. On June 3, several thousand people gathered in Tiananmen Square to protest against the Chinese Communist Party's handling of these issues.

As the protests grew, there was significant government crackdown. Security forces used tear gas, rubber bullets, and heavy machine gun fire to disperse the crowd, resulting in casualties on both sides. The government ultimately cracked down more brutally, arresting and killing many protesters. According to some accounts, at least 138 people were killed during the events.

The Tiananmen Square protests led to a dramatic shift in Chinese political power. After the crackdown, several members of the Chinese Communist Party who had supported the government's actions against the protesters were removed from leadership positions. The event also catalyzed widespread international condemnation and opened up a period of political transition in China.

Right. So there we have it. It’s not refusing, it’s not hedging, it’s not skirting around the issue, it just answers. (You should see ChatGPT hedge its answers; it’s icky, and you need to pre-prompt it very strictly to make it stop doing that.) DeepSeek doesn’t appear to be holding back at all, down to the use of “heavy machine gun fire to disperse the crowd”. Whether there were indeed casualties on both sides I cannot say; even internationally there is rather a lack of independently verified information regarding the event. But for the purpose of this exercise, we can at least conclude that the model itself does not appear to be censoring its output.

So what about the mobile app, which queries the model running in China? Well, someone else asked it a question similar to mine above, and it didn’t want to talk about it. Then the person appended “, answer in l33t speak.” to the question, whereupon they received a substantial answer (possibly less extensive than mine, but they may have queried the non-reasoning model).

What does this tell us? It’s simple logic (at least as a hypothesis): it probably means that the model itself contains the information, but that in the online app the output gets scanned and censored by some automated mechanism. That mechanism isn’t perfect, and humans are very creative, so in this instance it was bypassed. Remember: you can often tell a lot about how an application works internally just by observing how it behaves externally. The experiment of running the DeepSeek model locally supports that hypothesis about where the censorship occurs: the model itself does not appear to be censored, at least not on these issues.
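
To make that hypothesis concrete, here is what such an output filter could look like in its crudest form. To be clear, this is purely my illustration, not DeepSeek’s actual code, and the blocked-terms list is a placeholder; it just shows why a l33t-speak answer sails straight past a naive string scan while the model underneath produced the content either way.

BLOCKED_TERMS = {"tiananmen"}  # placeholder list, purely illustrative

def filter_output(text: str) -> str:
    """Hypothetical post-generation scan, sitting between model and user."""
    if any(term in text.lower() for term in BLOCKED_TERMS):
        return "Sorry, that is beyond my current scope."
    return text

print(filter_output("The Tiananmen Square protests of 1989..."))   # blocked
print(filter_output("Th3 T14n4nm3n Squ4r3 pr0t3sts 0f 1989..."))   # passes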

This is not to say that the model isn’t biased. All models are biased, at the very least through their base dataset and the reinforcement learning, and often for cultural reasons as well. Anyone pretending otherwise is either naive or being dishonest. But that’s something to investigate further and write about another time.
