

Post History

Meta How should I organize material about text encoding in Python into questions?


3 answers  ·  posted 8mo ago by Karl Knechtel‭  ·  last activity 8mo ago by Karl Knechtel‭

#1: Initial revision by Karl Knechtel · 2023-08-29T19:07:16Z (8 months ago)
How should I organize material about text encoding in Python into questions?
I want to write one or more self-answered Q&As on the topic of text encoding in Python, to serve as canonicals and preempt future lower-quality questions. I can think of the following things that need to be addressed:

* What is *an encoding*?
* What are *encoding* (the process) and *decoding*? How do I know which is which?
* Why do I need a text encoding? *When* do I need one?
* How can I know which text encoding to use?
* How can I know *if/how much freedom I have* in choosing a text encoding?
* Are encodings used for other things? Why?
* How do I specify an encoding...
  * for converting bytes to a string or vice versa?
  * for reading and writing files?
  * when working with web libraries such as Requests, BeautifulSoup etc.?
  * when using a library to parse formats like CSV, JSON etc.?
* What is the `codecs` standard library module for, and how does it relate to text encoding?
* What are `UnicodeEncodeError` and `UnicodeDecodeError`? What do they mean; what causes them; and how do I resolve them?
* Historical: in Python 2.x, why can attempts to decode cause `UnicodeEncodeError`, and vice versa?
* Historical / migration: how should I understand the type names `bytes`, `str` and `unicode` in 2.x vs 3.x?
* Historical: What was `basestring` in 2.x and why was it needed?
* Historical / migration: why did 2.x treat those types the way it did, and why does 3.x treat them differently? Why shouldn't I try to emulate the old approaches in new code?
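
For concreteness, here is a minimal sketch of the kind of Python 3 code several of these facets are about: encoding versus decoding, specifying an encoding for file I/O, and the two exception types. The sample string and the file name are illustrative assumptions, not taken from any particular question.

```python
import os
import tempfile

text = "na\u00efve caf\u00e9"  # sample string with two non-ASCII characters (assumed for illustration)

# Encoding turns str into bytes; decoding turns bytes back into str.
data = text.encode("utf-8")
assert data.decode("utf-8") == text

# Text-mode file I/O takes the encoding as a keyword argument to open().
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    with open(path, encoding="utf-8") as f:
        round_tripped = f.read()

# A codec that cannot represent the data raises an exception...
try:
    data.decode("ascii")
except UnicodeDecodeError as exc:
    decode_error = exc

try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    encode_error = exc

# ...while a wrong codec that accepts every byte silently yields mojibake.
mojibake = data.decode("latin-1")
```

The last line is the case that tends to generate the most confusing questions, since nothing fails at the point where the wrong encoding is applied.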

There might be more that I'm forgetting.

My question here is, *how should I organize* these facets of the topic into questions? I don't think all of this material can be covered in a single post, but splitting it too finely causes problems later: it becomes hard to find the right question because searches turn up the other ones instead, and material ends up duplicated between questions.